Intuitively Create CNN for Fashion Image Multi-class Classification

Category: IT Technology · Published: 4 years ago

Img adapted from Pixabay via link

In my previous article, I walked through how to build a Convolutional Neural Network (CNN) for a binary image classification problem. In this article, I will create another CNN, this time for the retail marketing industry. What sets this article apart is a different input data format, which requires different data processing methods, and a different CNN architecture for multi-class classification. It is split into 6 parts.

  1. Problem statement
  2. Data processing
  3. Model building
  4. Model compiling
  5. Model fitting
  6. Model evaluation

1. Problem statement

We are given a set of images from the retail industry. The task is to create a CNN model that predicts the label of a fashion image: 0 for T-shirt; 1 for Trouser; 2 for Pullover; 3 for Dress; 4 for Coat; 5 for Sandal; 6 for Shirt; 7 for Sneaker; 8 for Bag; 9 for Ankle boot.
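As a small convenience, the label list above can be kept in a lookup table so that predictions print as names instead of numbers. This is only a sketch: `CLASS_NAMES` and `label_to_name` are hypothetical helper names, not part of the original code.

```python
# Label-to-name lookup for the ten classes listed above.
# CLASS_NAMES and label_to_name are illustrative names, not from the original code.
CLASS_NAMES = [
    "T-shirt", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label):
    """Return the class name for a label 0-9 (int or float)."""
    return CLASS_NAMES[int(label)]

print(label_to_name(0))    # T-shirt
print(label_to_name(9.0))  # Ankle boot
```

The `int()` cast is there because the labels end up as float32 values once the data is loaded into a NumPy array.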

The data is the Fashion-MNIST dataset of 70,000 images, split into 60,000 for the training set and 10,000 for the test set. All images are grayscale, 28 pixels high and 28 pixels wide. Each pixel value is an integer from 0 (black) to 255 (white).

Figure 1 is a snippet of the training data. Each row represents one image and holds its label followed by 784 pixel values.

Fig.1 A snippet of training data

2. Data processing

First, read in the training and test data and convert the DataFrames to NumPy arrays.

import numpy as np
import pandas as pd
fashion_train_df = pd.read_csv('fashion-mnist_train.csv', sep=',')
fashion_test_df = pd.read_csv('fashion-mnist_test.csv', sep=',')
training = np.array(fashion_train_df, dtype='float32')  # column 0 is the label, the rest are pixels
testing = np.array(fashion_test_df, dtype='float32')

To view an image in color or in grayscale, try the following:

import random
import matplotlib.pyplot as plt
i = random.randint(0, len(training) - 1)  # random row index; randint includes both endpoints
plt.imshow(training[i, 1:].reshape((28, 28)))  # reshape and plot with the default (color) colormap
plt.imshow(training[i, 1:].reshape((28, 28)), cmap='gray')  # or plot in grayscale; run one call or the other
plt.show()

Next, scale the independent variables, namely the pixels, between 0 and 1.

X_train = training[:,1:]/255
y_train = training[:,0]
X_test = testing[:,1:]/255
y_test = testing[:,0]
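To see what the slicing and scaling do, here is a minimal sketch on a tiny synthetic array. The array `demo` is a made-up stand-in for `training` (random values, not real images), and the names `X_demo` and `y_demo` are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for `training`: 4 rows of [label, 784 pixel values].
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=(4, 1))
pixels = rng.integers(0, 256, size=(4, 784))
demo = np.hstack([labels, pixels]).astype("float32")

X_demo = demo[:, 1:] / 255  # pixel columns, scaled into [0, 1]
y_demo = demo[:, 0]         # label column, left unscaled

print(X_demo.shape, y_demo.shape)                # (4, 784) (4,)
print(X_demo.min() >= 0.0, X_demo.max() <= 1.0)  # True True
```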

Then, split the training data into training and validation sets, with validation taking 20%. The validation set is used to evaluate how well the model generalizes to data it has not been trained on.

from sklearn.model_selection import train_test_split
X_train, X_validate, y_train, y_validate = train_test_split(X_train, y_train, test_size=0.2, random_state=12345)

Finally, we need to reshape X_train, X_validate and X_test. This is a critical point. For a CNN, Keras expects input data in a specific shape, by default (batch size, image height, image width, number of colour channels). Therefore,

X_train = X_train.reshape((-1, 28, 28, 1))
X_test = X_test.reshape(X_test.shape[0], *(28, 28, 1))
X_validate = X_validate.reshape(X_validate.shape[0], *(28, 28, 1))

Note that two ways of reshaping are used above, and both achieve the same goal. The first passes -1 as the first dimension so that NumPy infers it, while the second gives the first dimension explicitly as X_test.shape[0] and unpacks the tuple (28, 28, 1) with *.
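A quick way to convince yourself that the two reshape styles are equivalent is to run both on the same array. The sketch below uses a small made-up array `flat` in place of the real data.

```python
import numpy as np

# Three fake flattened images standing in for the real data.
flat = np.arange(3 * 784, dtype="float32").reshape(3, 784)

a = flat.reshape((-1, 28, 28, 1))              # NumPy infers the first dimension
b = flat.reshape(flat.shape[0], *(28, 28, 1))  # first dimension given explicitly

print(a.shape, b.shape)      # (3, 28, 28, 1) (3, 28, 28, 1)
print(np.array_equal(a, b))  # True
```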

Great, now the data is ready to train the model.

3. Model building

In general, building a CNN requires 4 steps: convolution, max pooling, flattening and full connection. Here we will build a CNN model with 2 convolutional layers.

3.1 Convolution

Fundamentally, a CNN is based on convolution. In simple terms, a convolution scans a given image with a kernel matrix and applies a filter to obtain a certain effect, such as blurring or sharpening. In a CNN, kernels are used for feature extraction: they pick out the most important pixels of an image while preserving the spatial relationship between pixels.

If you want a detailed explanation of the concept, please check the previous article. Feel free to explore this fantastic website to visualize how convolution works. Another great website, by Ryerson University, shows visually and interactively how a CNN works.
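The kernel scan described above can be sketched in a few lines of NumPy. This is an illustration only, not how Keras implements it: the function name `convolve2d_valid` is hypothetical, it uses stride 1 with no padding, and, like CNN layers, it does not flip the kernel (strictly speaking a cross-correlation). The blur kernel is fixed by hand here, whereas a CNN learns its kernel values during training.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 "image"
blur = np.full((3, 3), 1 / 9)                     # 3x3 averaging (blur) kernel

feature_map = convolve2d_valid(image, blur)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # each entry is the mean of one 3x3 patch
```

By the same arithmetic, a 3×3 kernel applied this way to a 28×28 image gives a 26×26 feature map.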

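from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
classifier = Sequential()
classifier.add(Conv2D(64, (3, 3), input_shape=(28, 28, 1), activation='relu'))  # 64 feature detectors, each 3x3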

Note that the number of feature detectors is set to 64, and each feature detector is a 3×3 array. input_shape is the shape of the input images to which we apply the feature detectors through convolution. We set it to (28, 28, 1). Here, 1 is the number of channels for a grayscale image, and 28×28 is the image dimension in each channel. This must match the shape of each image in X_train, X_test and X_validate.

The final argument is the activation function. We use ReLU to remove negative values from the feature maps, because depending on the kernel parameters, convolution can produce negative values. Setting negative values to zero adds non-linearity, which a non-linear classification problem needs.
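As a minimal sketch of what ReLU does to a feature map, the NumPy snippet below zeroes the negative entries and leaves the rest untouched. The function name `relu` and the sample values are made up for illustration.

```python
import numpy as np

def relu(x):
    """Replace negative values with 0 and keep non-negative values unchanged."""
    return np.maximum(x, 0)

feature_map = np.array([[-2.0, 0.5],
                        [3.0, -0.1]])
activated = relu(feature_map)
print(activated)
```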

3.2 Max pooling

Max pooling reduces the size of a feature map produced by convolution by sliding a window over it and taking the maximum value in each window. Ultimately, it aims to reduce the number of nodes in the fully connected layers without losing key features and spatial structure information in the images.

Specifically, we use MaxPooling2D() to add the pooling layer. In general, we use a 2×2 window for pooling.

classifier.add(MaxPooling2D(pool_size = (2, 2)))
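To make the pooling step concrete, here is a small NumPy sketch of non-overlapping 2×2 max pooling on a single feature map. The helper `max_pool_2x2` is a hypothetical name. It assumes the pooling window moves in steps of 2 and that leftover rows or columns are dropped, which is my understanding of the Keras default for a (2, 2) pool size.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling; an odd trailing row/column is dropped."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)  # group pixels into 2x2 blocks
    return blocks.max(axis=(1, 3))                  # maximum of each block

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 8]])
pooled = max_pool_2x2(fm)
print(pooled)
# [[4 2]
#  [2 8]]
```

Each side of the feature map is halved, so the number of values passed on drops by a factor of four.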

3.3 Dropout

Dropout is a common remedy for over-fitting. How does dropout work? During each training iteration, some neurons are randomly disabled to prevent them from depending on each other too much. With these neurons switched off, the neural network has a different architecture each time, which helps it learn independent correlations in the data and prevents the neurons from over-learning. Specifically,

classifier.add(Dropout(0.25))

Note that 25% of the neurons are disabled at each iteration.
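The random disabling can be sketched with a NumPy mask. This is an illustrative version of so-called inverted dropout, in which the surviving activations are scaled up by 1 / (1 - rate) so that their expected sum stays the same. The function name `dropout` and the sample data are assumptions for the example. At prediction time no neurons are dropped.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero a random `rate` fraction and rescale the rest."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
acts = np.ones(10_000)                   # made-up activations, all equal to 1
dropped = dropout(acts, rate=0.25, rng=rng)

print((dropped == 0).mean())  # roughly 0.25 of the values are zeroed
print(dropped.mean())         # roughly 1.0 because of the rescaling
```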

3.4 Convolution & Max Pooling

Based on previous experiments, we add a second convolution and max pooling layer to improve model performance.

classifier.add(Conv2D(32, (3, 3), activation='relu'))  # 32 feature detectors, each 3x3
classifier.add(MaxPooling2D(pool_size = (2, 2)))

3.5 Flattening

Flattening takes all the reduced feature maps after pooling and puts them into a single vector, which is the input for the fully connected layers. Specifically,

classifier.add(Flatten())
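It can help to trace how the image size shrinks on its way to the flattened vector. The sketch below does the arithmetic under stated assumptions: 3×3 kernels moved one pixel at a time without padding, 2×2 pooling that drops leftover rows and columns, and 32 feature maps coming out of the second convolution. The helper names `conv_out` and `pool_out` are hypothetical. Dropout does not change the shape, so it is left out.

```python
def conv_out(size, kernel=3):
    """Output size of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping pooling (leftover rows/columns dropped)."""
    return size // pool

size = 28
size = conv_out(size)  # 26 after the first 3x3 convolution
size = pool_out(size)  # 13 after the first 2x2 max pooling
size = conv_out(size)  # 11 after the second 3x3 convolution
size = pool_out(size)  # 5 after the second 2x2 max pooling

flattened_length = size * size * 32  # 32 feature maps from the second convolution
print(size, flattened_length)        # 5 800
```

Under these assumptions, each image reaches the fully connected layers as a vector of 800 values.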

3.6 Full connection

With the steps above, we converted an input image into a one-dimensional vector. Now let’s build a classifier using this vector as the input. Specifically,

classifier.add(Dense(units=32, activation='relu'))
classifier.add(Dense(units=10, activation='softmax'))

Note that for the first hidden layer, units, the number of nodes in the hidden layer, is set to 32. Please feel free to try more. It uses ReLU as its activation function. The output layer has 10 nodes, one per class, and uses softmax so that the outputs form a probability distribution over the mutually exclusive classes.

With that done, congratulations on finishing the model building. Figure 2 shows what we built.

Fig.2 CNN architecture diagram (Img created by Author)
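As a rough sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch assumes 3×3 kernels with one bias per filter, a flattened input of 5 × 5 × 32 = 800 values, and dense layers with one bias per node. The helper names are hypothetical, and the figures are arithmetic under those assumptions, not output copied from the model.

```python
def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

counts = {
    "conv_1": conv_params(3, 1, 64),          # 1 grayscale input channel, 64 filters
    "conv_2": conv_params(3, 64, 32),         # 64 input feature maps, 32 filters
    "dense_1": dense_params(5 * 5 * 32, 32),  # flattened vector into 32 hidden nodes
    "dense_2": dense_params(32, 10),          # 32 hidden nodes into 10 classes
}
for name, n in counts.items():
    print(name, n)
print("total", sum(counts.values()))  # total 45066
```

Pooling, dropout and flattening add no parameters. Calling classifier.summary() on the real model is the way to confirm the counts.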

4. Model compiling

With all layers added, let’s configure the CNN for training. An important decision to make is the loss function. Both categorical_crossentropy and sparse_categorical_crossentropy are meant for mutually exclusive classes, where each sample belongs to exactly one class. They differ in the label format: use categorical_crossentropy when the labels are one-hot encoded, and sparse_categorical_crossentropy when the labels are class indices. Our labels are the class indices 0 to 9, so we use the latter.

from keras.optimizers import Adam
classifier.compile(loss='sparse_categorical_crossentropy', optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
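For intuition about what this loss measures, here is a minimal NumPy sketch, not the Keras implementation. With integer labels, sparse categorical cross-entropy is the average of -log(probability assigned to the true class). The same number comes out of categorical cross-entropy once the labels are one-hot encoded. The function names and the probabilities are made up for the example.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability of the true class); labels are class indices."""
    rows = np.arange(len(labels))
    return float(np.mean(-np.log(probs[rows, labels])))

def categorical_crossentropy(probs, one_hot):
    """The same loss, with one-hot encoded labels."""
    return float(np.mean(-np.sum(one_hot * np.log(probs), axis=1)))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])  # made-up predicted class probabilities
labels = np.array([0, 1])            # class indices
one_hot = np.eye(3)[labels]          # the same labels, one-hot encoded

sparse_loss = sparse_categorical_crossentropy(probs, labels)
dense_loss = categorical_crossentropy(probs, one_hot)
print(sparse_loss, dense_loss)       # the two values agree
```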

5. Model fitting

Now the model is ready to be trained. We train the model for 50 epochs, that is, 50 passes over the training data. The model updates its weights after every batch of 512 samples. We use (X_validate, y_validate) to evaluate the model loss and accuracy at the end of each epoch.

epochs = 50
history = classifier.fit(X_train, y_train, batch_size=512, epochs=epochs, verbose=1, validation_data=(X_validate, y_validate))
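To get a feel for how much work this training run involves, the number of weight updates can be estimated with simple arithmetic. The sketch assumes 48,000 training samples, which is what remains of the 60,000 rows after the 20% validation split. The variable names are illustrative.

```python
import math

n_train = 48_000  # 60,000 rows minus the 20% validation split
batch_size = 512
epochs = 50

steps_per_epoch = math.ceil(n_train / batch_size)  # the last batch is smaller than 512
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 94 4700
```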

In the end, we obtained a training accuracy of 92% and a test accuracy of 90%. Quite good results!

6. Model evaluation

Now, let’s evaluate the model on the test set. Specifically,

evaluation = classifier.evaluate(X_test, y_test)

We obtained a test accuracy of 90%! Figure 3 below shows the predicted and true classes of some of the images.

Fig.3 Predicted and True class comparison
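A comparison like the one in Figure 3 rests on turning the model's class probabilities into predicted labels. The sketch below shows that step on made-up numbers: 4 images and 3 classes instead of the real 10. The variable names are hypothetical.

```python
import numpy as np

# Made-up class probabilities for 4 images and 3 classes (the real model outputs 10).
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
true_labels = np.array([0, 1, 2, 1])

predicted = np.argmax(probs, axis=1)          # most probable class per image
accuracy = np.mean(predicted == true_labels)  # fraction of correct predictions

print(predicted)  # [0 1 2 0]
print(accuracy)   # 0.75
```

With the trained model, the probabilities would come from classifier.predict(X_test), and the result would be compared against y_test.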

Finally, if you want to tune the model with much more data, feel free to explore this link. If you want to check out more advanced data science innovation in the retail industry, check this page.

Great! Huge congratulations on making it to the end. Hopefully, this gives you a sense of how to create a CNN for fashion image classification. If you need the source code, feel free to visit my Github page. Many thanks for your time!

