Photo by Tanguy Sauvin on Unsplash


In this article, we’ll learn what a convolutional neural network (CNN) is and implement one for the Malaria Cell Images dataset, which I downloaded from Kaggle.

A Convolutional Neural Network (CNN) is a type of multilayer neural network that is good at identifying patterns in data. It applies learned filters to extract the important features of the data, which are then used for classification. Because these networks are good at pattern recognition, they are mostly used with images. They can also work with other data, on the condition that the data has a meaningful order, i.e. shuffling the data would change its meaning.

Diving deep

To understand CNNs in detail, we need to understand two concepts:

  1. Convolutions
  2. Pooling


Convolution is the part of image processing that reads patterns from the image, and it is the main type of layer inside our CNN. It uses a filter matrix (also called a kernel) to extract the most important features of the image. This filter is usually of size 3×3, but the size can be changed as required. The filter slides over the image matrix; at each position it is multiplied element-wise with the pixels underneath it, and the products are summed into a single output value. Note that this is not an ordinary matrix multiplication. This is visualized more clearly in the following diagram:


Filters are very useful when dealing with images. You can see more examples of such filters here. Notice how the filter values change the look of the image and highlight particular patterns.
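To make the filter arithmetic concrete, here is a minimal NumPy sketch of a single filter sliding over a small image. The 5×5 image and the vertical-edge filter values are made up for illustration, and `convolve2d` is a hypothetical helper name, not something from the original code. Like the convolution layers in deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Element-wise product followed by a sum, not a matrix product.
            output[i, j] = np.sum(patch * kernel)
    return output

# Made-up 5x5 image: dark on the left (0), bright on the right (1).
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
])

# Made-up 3x3 filter that responds to dark-to-bright vertical edges.
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
])

feature_map = convolve2d(image, kernel)
print(feature_map)
# [[3. 3. 0.]
#  [3. 3. 0.]
#  [3. 3. 0.]]
```

The output is large where the window covers the edge and zero where the window sees only uniform pixels, which is what "highlighting a pattern" means in practice.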



Pooling is used to decrease the dimensionality of the image. It keeps the most prominent pixel values and discards the rest; max pooling, for example, keeps only the largest value in each window. The image below shows how MaxPooling works in our neural network.



See how it reduces the 4×4 matrix to a 2×2 matrix while keeping the information about the important features.

Multiple layers of convolution and pooling are then stacked to extract the patterns. This also reduces the dimensions of the images before they are fed to the dense layers ahead.
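The same reduction can be sketched in a few lines of NumPy. The 4×4 values below are made up (they are not the ones from the image above), and `max_pool2d` is a hypothetical helper that assumes a 2×2 window with a stride of 2.

```python
import numpy as np

def max_pool2d(matrix, size=2):
    """Keep the largest value in each non-overlapping `size` x `size` window."""
    h, w = matrix.shape
    pooled = np.zeros((h // size, w // size), dtype=matrix.dtype)
    for i in range(h // size):
        for j in range(w // size):
            window = matrix[i * size:(i + 1) * size, j * size:(j + 1) * size]
            pooled[i, j] = window.max()
    return pooled

# Made-up 4x4 feature map.
matrix = np.array([
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
])

pooled = max_pool2d(matrix)
print(pooled)
# [[6 5]
#  [7 9]]
```

Each output value is the maximum of one 2×2 block, so the strongest response in every region survives while the matrix shrinks to a quarter of its size.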

Into the code

We’ll begin by importing the libraries.



Our dataset contains two folders of images, one for parasitized cells and one for uninfected cells. These images must be preprocessed before they are passed to the model. This step is crucial because it has a major impact on the accuracy of the model.



Here, we loop through all the images in the directories and resize every image to 50 by 50 pixels. Each image is added to the data list and its label to the labels list.



Next, we convert the data into NumPy arrays so it can be passed to the model, and then shuffle these arrays.
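One detail that is easy to get wrong here is shuffling the images and the labels separately, which would break the pairing between them. A minimal sketch of one way to keep them aligned is to apply the same random permutation to both arrays. The `data` and `labels` lists below are small stand-ins, not the real dataset, and the variable names and seed are assumptions.

```python
import numpy as np

# Stand-in lists: six fake 50x50 RGB "images", each filled with its own index,
# and a label of 0 for the first three and 1 for the last three.
data = [np.full((50, 50, 3), i, dtype=np.uint8) for i in range(6)]
labels = [0, 0, 0, 1, 1, 1]

images = np.array(data)      # shape: (6, 50, 50, 3)
targets = np.array(labels)   # shape: (6,)

# Draw one permutation and index both arrays with it,
# so every image keeps its own label after shuffling.
rng = np.random.default_rng(seed=42)
order = rng.permutation(len(images))
images = images[order]
targets = targets[order]
```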



Now we separate the training and testing images and divide the image arrays by 255 to normalize them. But why?

Each pixel in an image holds a value between 0 and 255. Dividing the arrays by 255 therefore produces values between 0 and 1, a small and consistent range that is easier for the network to train on.
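As a rough sketch of this step, the snippet below splits already-shuffled arrays and rescales the pixel values. The random stand-in images and the 80/20 split ratio are assumptions for illustration; the article does not state the ratio it used.

```python
import numpy as np

# Stand-in for the shuffled dataset: ten random 50x50 RGB images with labels.
rng = np.random.default_rng(seed=0)
images = rng.integers(0, 256, size=(10, 50, 50, 3), dtype=np.uint8)
targets = np.array([0, 1] * 5)

# Assumed 80/20 train/test split.
split = int(0.8 * len(images))

# Convert to float before dividing so the result lies in [0, 1].
x_train = images[:split].astype("float32") / 255.0
x_test = images[split:].astype("float32") / 255.0
y_train, y_test = targets[:split], targets[split:]

print(x_train.shape, x_test.shape)  # (8, 50, 50, 3) (2, 50, 50, 3)
```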

Training the model

A convolutional neural network consists of multiple layers that learn from the data step by step, each passing its output to the next layer. It should consist of the following layers:


  1. Conv2D for the convolution layer
  2. MaxPooling2D for reducing the number of pixels in the image
  3. Flatten for converting the result into a flattened array
  4. Dense layer with softmax activation for the output

We can of course add other layers if required, but this is the standard structure used when working with images.
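As an illustration of how these four layer types fit together, here is a minimal Keras sketch. It is not necessarily the exact architecture used in this article: the number of filters (32 and 64), the relu activations, the two convolution blocks, and the `build_model` helper name are all assumptions. The input shape assumes 50×50 RGB images, and the two output units correspond to the parasitized and uninfected classes.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_shape=(50, 50, 3), num_classes=2):
    """Small CNN sketch: two Conv2D/MaxPooling2D blocks, then Flatten and Dense."""
    return keras.Sequential([
        keras.Input(shape=input_shape),
        # Assumed filter counts and activations, chosen only for illustration.
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        # Turn the 3D feature maps into a 1D vector for the dense layer.
        layers.Flatten(),
        # One probability per class.
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_model()
model.summary()
```

With these assumed settings the spatial size shrinks from 50 to 48 to 24 to 22 to 11, so the Flatten layer hands 11 × 11 × 64 = 7744 values to the output layer.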



The summary of this model looks like this:


Then we need to compile our model with a loss function, metrics, and an optimizer. We’re using adam as the optimizer and categorical_crossentropy as the loss function.
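To get a feel for what categorical cross-entropy measures, here is a small NumPy sketch of the formula, loss = -Σ y_true · log(y_pred), applied to two made-up predictions. This is a hand-written illustration, not the Keras implementation. It assumes one-hot encoded labels, and which class sits at index 0 or 1 is arbitrary here.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and predicted probabilities."""
    # Clip to avoid log(0) for predictions that are exactly 0 or 1.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# One-hot labels for two samples: class 0 and class 1.
y_true = np.eye(2)[[0, 1]]

# Made-up softmax outputs: a confident correct guess and an unsure one.
y_pred = np.array([
    [0.9, 0.1],
    [0.5, 0.5],
])

losses = categorical_crossentropy(y_true, y_pred)
print(losses)  # roughly [0.105, 0.693]
```

The confident, correct prediction gets a much smaller loss than the 50/50 guess, and this is the quantity the optimizer tries to reduce during training.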



Finally, we’ll fit the model with the training images and labels.



This gives a training accuracy of 99.11% at the end of 20 epochs and a test accuracy of 96.11%, which is really good. Let’s plot the graphs of accuracy and loss over time.