What is CNN in Deep Learning? Architecture, Working & Applications

Convolutional Neural Network (CNN)

A CNN is an extension of the MLP (Multi-Layer Perceptron)

What is CNN?

A CNN is a specialized neural network designed mainly for image processing, visual pattern recognition, and other computer vision tasks.

Why is CNN important?

  • Handles large image datasets efficiently
  • Automatically extracts features, with no hand-crafted feature engineering
  • Reduces the number of parameters through weight sharing and local connectivity
  • Provides high accuracy in vision tasks
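
The parameter saving can be made concrete with a little arithmetic. The sketch below compares a fully connected layer with a convolution layer on the same input; the 224×224 RGB image, the 64 units/filters, and the function names are illustrative assumptions, not values from any specific model.

```python
def dense_params(in_h, in_w, in_c, units):
    """Weights + biases of a fully connected layer on a flattened image."""
    return in_h * in_w * in_c * units + units


def conv_params(kernel_h, kernel_w, in_c, filters):
    """Weights + biases of a convolution layer (independent of image size)."""
    return kernel_h * kernel_w * in_c * filters + filters


# Assumed input: a 224x224 RGB image; 64 units vs. 64 filters of size 3x3.
fc = dense_params(224, 224, 3, 64)
conv = conv_params(3, 3, 3, 64)

print(f"fully connected: {fc:,} parameters")  # 9,633,856
print(f"convolution:     {conv:,} parameters")  # 1,792
```

Because the same small filter is reused at every image position, the convolution layer's parameter count does not grow with the image size.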

Structure of CNN:

  • Input layer: Receives the image
  • Convolution layer:
    • Detects features using a filter (kernel)
    • Detects edges, corners, and textures
    • Scans the full image by sliding the filter across it
  • Activation function (ReLU):
    • ReLU (Rectified Linear Unit): f(x) = max(0, x)
    • Adds non-linearity
    • Sets negative values to zero
  • Pooling layer:
    • Max pooling / average pooling
    • Reduces spatial dimensions
    • Reduces computation
    • Helps reduce overfitting
  • Fully connected layer:
    • Works like a traditional neural network
    • Makes the final classification
  • Softmax layer:
    • Converts the final scores into class probabilities
    • Used in classification tasks
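
The layers listed above can be sketched as one forward pass in NumPy. This is a minimal illustration, not a production implementation: the function names, the random 8×8 single-channel image, the vertical-edge kernel, the randomly initialized fully connected weights, and the choice of 3 classes are all assumptions made for the example.

```python
import numpy as np


def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    Like most deep learning libraries, this computes a cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """f(x) = max(0, x): negative values become zero."""
    return np.maximum(0, x)


def max_pool(x, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()


rng = np.random.default_rng(0)
image = rng.random((8, 8))                      # assumed toy input
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])           # responds to vertical edges

# 8x8 -> 6x6 after convolution -> 3x3 after 2x2 max pooling
features = max_pool(relu(conv2d(image, kernel)))

# Hypothetical fully connected layer mapping 9 features to 3 class scores
weights = rng.normal(size=(3, features.size))
probs = softmax(weights @ features.ravel())

print(features.shape, probs)
```

In a trained network the kernel and the fully connected weights would be learned from data rather than fixed or random.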



Real world applications of CNN

  • Face recognition (Facebook, iPhone Face ID)
  • Object detection (YOLO, R-CNN)
  • Medical imaging (tumor detection)
  • Self-driving cars
  • Security surveillance
  • Handwriting recognition
  • License plate detection
  • Agriculture disease detection
  • Retail product recognition
  • Satellite image analysis

Popular CNNs

1. LeNet-5 (1998)

Proposed by: Yann LeCun
  • One of the earliest CNN architectures
  • Designed for handwritten digit recognition (MNIST)
  • Uses convolution + pooling + fully connected layers
  • Simple but historically important
Use Case: Bank cheque digit recognition

2. AlexNet (2012)

Developed by: Alex Krizhevsky, with Ilya Sutskever and Geoffrey Hinton
  • Winner of the ImageNet (ILSVRC) 2012 competition
  • Much deeper than LeNet
  • Popularized ReLU activation in deep CNNs
  • Trained on GPUs
Impact: Started the deep learning revolution in computer vision.



3. VGGNet (2014)

Developed by: Oxford Visual Geometry Group
  • Variants: VGG16, VGG19
  • Uses very small 3×3 filters
  • Deep but simple architecture
Pros: High accuracy
Cons: Very heavy (VGG16 has about 138 million parameters)
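
The preference for small 3×3 filters can be checked with simple arithmetic: stacking them covers the same area as one larger filter while using fewer weights. The sketch below assumes 256 input and output channels and ignores biases; the function names are hypothetical.

```python
def conv_weights(kernel, channels, layers=1):
    """Weight count (no biases) of stacked kernel x kernel convolutions
    that keep the same number of input and output channels."""
    return layers * kernel * kernel * channels * channels


def receptive_field(kernel, layers):
    """Input area seen by one output value after stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


C = 256  # assumed channel count

# Two stacked 3x3 layers see the same 5x5 area as a single 5x5 layer.
print(receptive_field(3, 2), conv_weights(3, C, layers=2))  # 5 1179648
print(receptive_field(5, 1), conv_weights(5, C))            # 5 1638400
```

The stacked version also applies a non-linearity between the two layers, which a single large filter cannot do.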


4. GoogLeNet / Inception (2014)

Developed by: Google
  • Introduced Inception module
  • Performs multiple convolutions in parallel
  • More efficient than VGG
Advantage: High performance with fewer parameters
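
One way the Inception module keeps its parameter count low is by placing a 1×1 convolution in front of the expensive larger filters, then concatenating the parallel branches along the channel axis. The sketch below counts weights only; the input depth of 192 and the branch widths are illustrative assumptions, and biases are ignored.

```python
def conv_weights(kernel, in_c, out_c):
    """Weight count (no biases) of one kernel x kernel convolution."""
    return kernel * kernel * in_c * out_c


in_c = 192  # assumed number of input channels

# 5x5 branch applied directly to the input
direct = conv_weights(5, in_c, 32)

# Same branch with a 1x1 convolution first reducing 192 channels to 16
reduced = conv_weights(1, in_c, 16) + conv_weights(5, 16, 32)

# Parallel branches (1x1, 3x3, 5x5, pooling projection) are concatenated,
# so the output depth is the sum of the assumed branch widths.
out_channels = 64 + 128 + 32 + 32

print(direct, reduced, out_channels)  # 153600 15872 256
```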

5. ResNet (2015)

Developed by: Microsoft
  • Introduced Residual Connections (Skip Connections)
  • Eased the vanishing gradient problem in very deep networks
  • Can train very deep networks (50, 101, 152 layers)
Famous Variants:
  • ResNet50
  • ResNet101
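
A residual connection adds the block's input back to its output, so the block computes y = ReLU(F(x) + x). The sketch below is a simplified illustration in which two plain matrix multiplications stand in for the convolution layers; the function name, weights, and input are assumptions.

```python
import numpy as np


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU between.

    Real ResNet blocks use convolutions and batch normalization;
    matrix products are used here only to keep the sketch short.
    """
    out = np.maximum(0, x @ w1)    # first layer + ReLU
    out = out @ w2                 # second layer
    return np.maximum(0, out + x)  # skip connection, then ReLU


x = np.array([1.0, 2.0, 3.0])
zeros = np.zeros((3, 3))

# With all-zero weights F(x) = 0, so the block simply passes x through.
y = residual_block(x, zeros, zeros)
print(y)
```

Because a block can fall back to passing its input through unchanged, adding more blocks is less likely to make the network harder to train.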



6. DenseNet (2017)
  • Within a dense block, each layer receives the feature maps of all preceding layers
  • Improves feature reuse
  • Reduces vanishing gradient
Benefit: Efficient parameter usage
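
Since every layer in a dense block concatenates its output onto everything before it, the channel count grows by a fixed amount (the growth rate) per layer. The sketch below tracks that growth; the starting depth of 64, growth rate of 32, and 6 layers are assumed example values, and the function name is hypothetical.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return the input depth seen by each layer and the block's output depth."""
    inputs = [in_channels + i * growth_rate for i in range(num_layers)]
    output = in_channels + num_layers * growth_rate
    return inputs, output


inputs, out = dense_block_channels(in_channels=64, growth_rate=32, num_layers=6)
print(inputs)  # [64, 96, 128, 160, 192, 224]
print(out)     # 256
```

Each layer only has to produce a small number of new feature maps, which is where the efficient parameter usage comes from.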

7. MobileNet (2017)

Designed for: Mobile & embedded devices
  • Lightweight CNN
  • Uses depthwise separable convolution
  • Fast and low computation
Use Case: Mobile vision apps
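
A depthwise separable convolution splits a standard convolution into a per-channel spatial filter (depthwise) followed by a 1×1 convolution that mixes channels (pointwise). The sketch below compares weight counts; the 3×3 kernel and the 128 to 256 channel sizes are assumptions, and biases are ignored.

```python
def standard_conv_weights(k, in_c, out_c):
    """Weights of a standard k x k convolution."""
    return k * k * in_c * out_c


def depthwise_separable_weights(k, in_c, out_c):
    """Weights of a depthwise k x k step plus a pointwise 1x1 step."""
    depthwise = k * k * in_c   # one k x k filter per input channel
    pointwise = in_c * out_c   # 1x1 convolution that mixes channels
    return depthwise + pointwise


std = standard_conv_weights(3, 128, 256)
sep = depthwise_separable_weights(3, 128, 256)

print(std, sep)                 # 294912 33920
print(f"ratio: {sep / std:.3f}")  # ratio: 0.115
```

The ratio equals 1/out_c + 1/k², so with 3×3 kernels the separable version needs roughly one ninth of the weights.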

8. EfficientNet (2019)
  • Scales depth, width, and resolution systematically
  • Achieves high accuracy with fewer parameters
Popular Variants:
  • EfficientNet-B0 through EfficientNet-B7
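
The systematic scaling can be sketched with the compound scaling rule from the EfficientNet paper: depth, width, and resolution are multiplied by α^φ, β^φ, and γ^φ for a single coefficient φ, with α = 1.2, β = 1.1, γ = 1.15 chosen so that α·β²·γ² ≈ 2. The function name below is hypothetical, and the released B1 to B7 models use their own rounded settings rather than these exact multipliers.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


# Approximate growth in computation for each +1 step in phi
flops_factor = ALPHA * BETA ** 2 * GAMMA ** 2
print(f"compute grows about {flops_factor:.2f}x per step")

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```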

9. YOLO (You Only Look Once)
  • Real-time object detection CNN
  • Processes the whole image in a single forward pass
  • Very fast
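
The single pass works because the network outputs every prediction at once as one grid-shaped tensor. The sketch below computes that output shape using the settings of the original YOLO (v1) paper: a 7×7 grid, 2 boxes per cell, and 20 classes. The function name is hypothetical, and later YOLO versions use different output layouts.

```python
def yolo_v1_output_shape(grid=7, boxes=2, classes=20):
    """Each grid cell predicts `boxes` boxes (x, y, w, h, confidence)
    plus one set of class probabilities."""
    return (grid, grid, boxes * 5 + classes)


shape = yolo_v1_output_shape()
total = shape[0] * shape[1] * shape[2]
print(shape, total)  # (7, 7, 30) 1470
```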
