The Complete ImageNet Training Workflow

August 29, 2018, 23:16:10, SrdLaplaceGua

Downloading the data

Training set (138 GB)

Validation set (6.3 GB, 50,000 images)

train_label.txt

validation_label.txt

P.S. Downloading with Xunlei (Thunder) was fairly fast; everything was done in 3 days.

Extracting the data

mkdir -p ./train ./val

tar xvf ILSVRC2012_img_train.tar -C ./train

tar xvf ILSVRC2012_img_val.tar -C ./val

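For the training set, the first extraction produces 1000 tar files (one per class), and each of these has to be extracted again. The script unzip.sh does this: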

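# Directory that holds the 1000 per-class tar files
dir=/data/srd/data/Image/ImageNet/train

for x in "$dir"/*.tar
do
    # Name the class folder after the archive (without the .tar suffix)
    filename=$(basename "$x" .tar)
    mkdir -p "$dir/$filename"
    # Delete the archive only if it was extracted successfully
    tar -xvf "$x" -C "$dir/$filename" && rm "$x"
done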
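
Once the script has finished, a quick check can confirm that nothing was skipped. The following is a minimal Python sketch, not part of the original workflow: the function name summarize_train_dir is hypothetical, and it assumes the extracted images carry a .JPEG (or .jpg) suffix.

```python
import os


def summarize_train_dir(train_dir):
    """Return {class_folder_name: number_of_JPEG_files} for every sub-directory."""
    counts = {}
    for entry in sorted(os.listdir(train_dir)):
        class_dir = os.path.join(train_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip leftover .tar archives and other stray files
        counts[entry] = sum(
            1
            for name in os.listdir(class_dir)
            if name.lower().endswith((".jpeg", ".jpg"))
        )
    return counts
```

Calling summarize_train_dir on the directory used in unzip.sh should report 1000 class folders; a folder with a count of 0 points to an archive that failed to extract.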

Using the dataset

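Each folder in the extracted training set contains the images of one class. The label that corresponds to each folder name is stored in the downloaded label file meta.mat, a MATLAB file whose contents can be read with scipy.io.loadmat. The validation set contains 50,000 images, and the label of each image is given in ILSVRC2012_validation_ground_truth.txt.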
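
The two label sources described above could be read roughly as follows. This is a sketch under assumptions, not the code used for this post: the function names are hypothetical; the field names WNID, ILSVRC2012_ID and num_children and the struct array name synsets are what I believe the devkit's meta.mat contains; and the ground-truth file is assumed to hold one class ID per line, in the order of the validation file numbers.

```python
def load_wnid_to_id(meta_path):
    """Map training folder names (WordNet IDs) to ILSVRC2012 class IDs.

    Assumption: meta.mat holds a struct array 'synsets' with the fields
    'WNID', 'ILSVRC2012_ID' and 'num_children'.
    """
    import scipy.io  # imported lazily so the rest of the sketch runs without SciPy

    synsets = scipy.io.loadmat(meta_path, squeeze_me=True)["synsets"]
    # Assumption: the actual classes are the leaf synsets (no children).
    return {
        str(s["WNID"]): int(s["ILSVRC2012_ID"])
        for s in synsets
        if int(s["num_children"]) == 0
    }


def load_val_labels(ground_truth_path):
    """Map validation file names to class IDs.

    Assumption: line i (1-based) of the ground-truth file is the label of
    the validation image numbered i.
    """
    labels = {}
    with open(ground_truth_path) as f:
        for i, line in enumerate(f, start=1):
            line = line.strip()
            if line:
                labels["ILSVRC2012_val_%08d.JPEG" % i] = int(line)
    return labels
```

Note that the class IDs read this way start at 1, so most training frameworks will want them shifted to start at 0.
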
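Data augmentation: images are drawn in random order; each image is rescaled so that its shorter side is 256 pixels, a random 224×224 crop is taken from it, the per-channel mean is subtracted from every channel, and the result is randomly flipped left to right.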
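
To make the order of these steps concrete, here is a small pure-Python sketch that works on nested lists of pixels. It is an illustration only: a real pipeline would use an image library for decoding and resizing, all function names are hypothetical, and the channel means are passed in as a parameter because the post does not state the values used.

```python
import random


def short_side_size(width, height, target=256):
    """Return (new_width, new_height) with the shorter side equal to target."""
    scale = target / min(width, height)
    # max() guards against floating-point rounding below the target
    return max(target, round(width * scale)), max(target, round(height * scale))


def random_crop_box(width, height, crop=224, rng=random):
    """Pick a random crop x crop window as (left, top, right, bottom)."""
    left = rng.randint(0, width - crop)
    top = rng.randint(0, height - crop)
    return left, top, left + crop, top + crop


def augment_pixels(rows, box, channel_mean, flip):
    """Crop, subtract the per-channel mean, and optionally mirror.

    rows is an H x W nested list of (r, g, b) tuples that has already
    been resized; channel_mean is a 3-tuple supplied by the caller.
    """
    left, top, right, bottom = box
    out = []
    for row in rows[top:bottom]:
        new_row = [
            tuple(value - mean for value, mean in zip(pixel, channel_mean))
            for pixel in row[left:right]
        ]
        if flip:
            new_row.reverse()  # left-right flip
        out.append(new_row)
    return out
```

For a 500×375 image, short_side_size gives a resize target of 341×256, which leaves room for the 224×224 crop in both directions. The flip flag would be drawn per image, for example with random.random() < 0.5, and the file list shuffled once per epoch with random.shuffle.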

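Choosing the neural network models

Since I had already implemented DenseNet, this time I tried out ResNeXt and Inception-ResNet-v2:

ResNeXt: the implementations I found online all seemed somewhat off to me. The way they split the channels did not appear to match what the paper describes, and when I trained one on CIFAR-100 the results did not agree with the paper's conclusions either. So I wrote my own version based on my reading of the paper, and its results on ImageNet match the paper fairly well.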

blocks of ResNeXt: 256d(in)-(256,1×1,128)-(3×3,32x4d)-(128,1×1,256)-256d(out)
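
One way to sanity-check a reading of this block specification is to count its weights. The sketch below assumes that (3×3, 32x4d) means a grouped 3×3 convolution with 32 groups of 4 channels each, which is one of the equivalent forms given in the ResNeXt paper; it is not necessarily how the implementation in this post is structured, and the function names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel_size, groups=1):
    """Weight count of a (grouped) convolution, ignoring bias and batch norm."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


def resnext_block_params(in_channels=256, width=128, cardinality=32):
    """Weights in the 1x1 -> grouped 3x3 -> 1x1 bottleneck described above."""
    reduce = conv_params(in_channels, width, 1)                 # 256 -> 128
    grouped = conv_params(width, width, 3, groups=cardinality)  # 32 groups of 4
    expand = conv_params(width, in_channels, 1)                 # 128 -> 256
    return reduce + grouped + expand
```

With the default arguments this gives 32768 + 4608 + 32768 = 70144 weights, in line with the roughly 70k parameters the paper quotes for this block. An implementation whose block is much larger or smaller has probably split the channels differently.
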
Downsampling is done by stride-2 convolutions in the 3×3 layer of the first block in each stage. (For the shortcut, I use 2×2 average pooling with stride 2.)
The identity shortcuts can be directly used when the input and output are of the same dimensions. When the dimensions increase, we consider two options: (A) The shortcut still performs identity mapping, with extra zero entries padded for increasing dimensions. This option introduces no extra parameter; (B) The projection shortcut is used to match dimensions (done by 1×1 convolutions). For both options, when the shortcuts go across feature maps of two sizes, they are performed with a stride of 2. (I chose to pad with zero channels directly, i.e. option A.)
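
The two notes in parentheses describe a parameter-free shortcut for the downsampling blocks. The pure-Python sketch below shows one possible reading of it on a channels × height × width nested list: pool each channel with a 2×2 window at stride 2, then append all-zero channels until the output width is reached. Appending the zeros after the existing channels is my assumption, and the function names are hypothetical.

```python
def avg_pool_2x2(channel):
    """2x2 average pooling with stride 2 on one H x W channel (nested lists)."""
    height, width = len(channel), len(channel[0])
    return [
        [
            (channel[i][j] + channel[i][j + 1]
             + channel[i + 1][j] + channel[i + 1][j + 1]) / 4.0
            for j in range(0, width - 1, 2)
        ]
        for i in range(0, height - 1, 2)
    ]


def downsample_shortcut(x, out_channels):
    """Halve the spatial size and zero-pad the channel dimension (option A)."""
    pooled = [avg_pool_2x2(channel) for channel in x]
    height, width = len(pooled[0]), len(pooled[0][0])
    extra = out_channels - len(pooled)
    zeros = [[[0.0] * width for _ in range(height)] for _ in range(extra)]
    return pooled + zeros  # assumption: zero channels go after the real ones
```

Because neither step has weights, this shortcut adds no parameters, which matches the description of option A in the quoted passage.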

result: 50-layer

top 5 acc: 0.92708
top 1 acc: 0.7562
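
For reference, top-1 and top-5 accuracy are conventionally computed as below. This is a minimal sketch with a hypothetical function name, not the evaluation code behind the numbers above.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores.

    scores: one list of per-class scores for each sample
    labels: the true class index of each sample
    """
    hits = 0
    for row, label in zip(scores, labels):
        # class indices sorted by descending score, truncated to the best k
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)
```

Top-1 accuracy is the k=1 case and top-5 the k=5 case; the class indices in labels must use the same numbering as the positions in each score list.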

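Inception-ResNet-v2: implemented by following the paper directly. The modules involved are the three kinds of block, the two Reduction modules, and the stem.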
