MobileNet is a lightweight convolutional neural network architecture designed for mobile and embedded vision applications[1][3]. It targets devices with limited computational resources, trading a small amount of accuracy for a much smaller and faster model.
The key innovation of MobileNet is the use of depthwise separable convolutions, which factorize a standard convolution into a depthwise convolution (one spatial filter per input channel) and a 1×1 pointwise convolution that combines the channels[1][3]. With the 3×3 kernels MobileNet uses, this cuts computation to roughly one eighth to one ninth of a standard convolution and reduces the parameter count accordingly.
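The saving from this factorization can be checked with simple arithmetic. The following is a minimal sketch that counts multiply-adds for a single layer, assuming stride 1 and a square feature map; the function names and the example layer dimensions (3×3 kernel, 512 input and output channels, 14×14 feature map) are illustrative choices, not taken from the sources cited here.

```python
def standard_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds for a standard kernel x kernel convolution."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def separable_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds for a depthwise convolution plus a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_ch * fmap * fmap  # one spatial filter per input channel
    pointwise = in_ch * out_ch * fmap * fmap           # 1x1 convolution mixes the channels
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map
std = standard_conv_cost(3, 512, 512, 14)
sep = separable_conv_cost(3, 512, 512, 14)
ratio = sep / std  # algebraically equal to 1/out_ch + 1/kernel**2
print(f"standard: {std:,}  separable: {sep:,}  ratio: {ratio:.4f}")
```

Because the ratio simplifies to 1/N + 1/K² for N output channels and kernel size K, the 1/K² term dominates for wide layers, which is where the "8 to 9 times" figure for 3×3 kernels comes from.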
MobileNet’s architecture consists of:
- An initial standard (full) 3×3 convolution layer
- 13 depthwise separable convolution blocks
- A global average pooling layer
- A fully connected layer
- A softmax classifier[1][4]
Each convolution, depthwise and pointwise alike, is followed by batch normalization and a ReLU activation[4]. Downsampling is handled by strided convolutions, both in the first layer and in some of the depthwise convolution layers, rather than by intermediate pooling layers[4].
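To make the layer stack and its downsampling concrete, here is a rough sketch that traces the feature-map size through the network for a 224×224 input. The per-block channel counts and strides in `BLOCKS` are assumed from the architecture table in the original MobileNetV1 paper, with stride 1 in the final block as in common implementations; "same" padding is also an assumption, and `trace_shapes` is a hypothetical helper name.

```python
import math

# (output_channels, stride) for the 13 depthwise separable blocks.
# Assumed from the MobileNetV1 paper's architecture table.
BLOCKS = [
    (64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2),
    (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
    (1024, 2), (1024, 1),
]


def trace_shapes(input_size=224, first_channels=32):
    """Return (name, spatial_size, channels) per stage, assuming 'same' padding."""
    size = math.ceil(input_size / 2)  # the initial 3x3 convolution uses stride 2
    channels = first_channels
    stages = [("conv", size, channels)]
    for i, (out_ch, stride) in enumerate(BLOCKS, start=1):
        size = math.ceil(size / stride)  # only the depthwise step is strided
        channels = out_ch                # the pointwise step sets the channel count
        stages.append((f"dw_sep_{i}", size, channels))
    stages.append(("avg_pool", 1, channels))  # global average pooling to 1x1
    return stages


for name, size, channels in trace_shapes():
    print(f"{name:10s} {size:3d} x {size:<3d} x {channels}")
```

Under these assumptions, the five stride-2 steps (the first convolution plus four depthwise layers) reduce the spatial size by a factor of 32 before global average pooling feeds the fully connected layer.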
MobileNet introduces two hyperparameters to further tune the model size and latency:
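- Width multiplier (α): Thins the network uniformly by scaling the number of channels in every layer, reducing computation and parameters by roughly α²
- Resolution multiplier (ρ): Reduces the input image resolution, and with it every internal feature map, reducing computation by roughly ρ²[1]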
These allow developers to choose the right trade-off between accuracy, latency, and model size for their specific use case.
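The effect of the two hyperparameters on a single layer can be sketched by extending the usual multiply-add count for a depthwise separable convolution. In this sketch, `alpha` stands for the width multiplier and `rho` for the resolution multiplier; the function name, the truncation with `int`, and the example layer dimensions are assumptions made for illustration.

```python
def scaled_separable_cost(kernel, in_ch, out_ch, fmap, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer under a width
    multiplier alpha and a resolution multiplier rho, both in (0, 1]."""
    m = int(alpha * in_ch)   # thinned input channels
    n = int(alpha * out_ch)  # thinned output channels
    f = int(rho * fmap)      # reduced feature-map side length
    return kernel * kernel * m * f * f + m * n * f * f


# Illustrative layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map
base = scaled_separable_cost(3, 512, 512, 14)
for alpha, rho in [(1.0, 1.0), (0.75, 1.0), (0.5, 1.0), (1.0, 0.5), (0.5, 0.5)]:
    cost = scaled_separable_cost(3, 512, 512, 14, alpha, rho)
    print(f"alpha={alpha:<4} rho={rho:<4} cost={cost:>12,}  ({cost / base:.2%} of baseline)")
```

Halving the resolution multiplier cuts this layer's cost to exactly a quarter, while halving the width multiplier cuts it to slightly more than a quarter, because the depthwise term scales linearly rather than quadratically with the channel count.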
MobileNet has proven effective for various computer vision tasks, including:
- Image classification
- Object detection
- Face recognition
- Pose estimation[3]
Its efficiency and tunable size have made it a popular choice for mobile and embedded applications.
Further Reading
1. models/research/slim/nets/mobilenet_v1.md at master · tensorflow/models · GitHub
2. open_model_zoo/models/public/mobilenet-v2-1.4-224/README.md at master · openvinotoolkit/open_model_zoo · GitHub
3. MobileNetV1 Explained | Papers With Code
4. https://scholarworks.indianapolis.iu.edu/server/api/core/bitstreams/a7fbc815-0f25-480a-bce1-0cb231238b66/content
5. A New Image Classification Approach via Improved MobileNet Models with Local Receptive Field Expansion in Shallow Layers – PMC