The technology has become so much advanced with the advancement of Artificial Intelligence automating tasks and solving issues making our life simpler and easier. For such advancement, it becomes necessary for our technology to understand the images and process them. That’s when CNN comes into the picture. CNN [ Convolutional Neural Network ] is a kind of artificial neural network used as an ideal solution for most computer vision and image analysis-related problems such as image recognition and image classification. For example, To explain CNN in simple terms the computer trains itself to identify and recognize the image the same way a parent teaches his small child to recognize objects. Before moving into how to optimize your model let’s understand the basic CNN architecture and its components.
A CNN network consists of three layers: A convolutional layer, a pooling layer, and a fully convoluted layer.
The major building elements are convolutional layers. The layers consist of input vectors such as images and filters. The input layer holds the input image. The input is convolved by convolutional layers which pass this to the next layer. Each neuron present in the convolutional layer processes the data. These convolutional layers filter the input data and extract the information from the image. The purpose of the pooling layer is to reduce the size of the volume which makes the computation fast reduces the memory and prevents overfitting. A pooling layer is inserted between the converts. There are two common types of pooling mostly used Max-pooling and Average pooling. In max-pooling, the maximum value from a given part of the image covered by the kernel is obtained. Average pooling returns the average value from the portion of the image covered by the kernel.
In the end, there is a fully connected layer of neurons. This layer takes the input of all the previous layers and computes the class score. After training the feature vector from the fully connected layer is used to classify images into different categories. It helps to map the representation between the input and the output.
Choosing the parameters from the above-given points depends on the type of dataset you are using and the type of problem you are solving such as object detection, classification, or segmentation. Apart from these, many parameters need to be considered which can help in optimizing the model.