Three Ways of Object Detection on Android

This blog covers object detection on Android devices using TensorFlow, TensorFlow Lite (CPU & GPU), and ML Kit. With these libraries we can detect, locate, and track objects in a photo or a video, as well as in a real-time camera feed.

What is Object Detection?

Object detection is a computer vision technique that detects and locates objects in a given image or video. An object detection model identifies which of a known set of objects are present and provides information about their positions within the image.

What is TensorFlow?

TensorFlow is an open-source end-to-end platform for creating Machine Learning applications. It is a symbolic math library that uses dataflow and differentiable programming to perform various tasks focused on the training and inference of deep neural networks. By using it we can create machine learning applications using various tools, libraries, and community resources.

What is TensorFlow Lite?

TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded devices. It lets you run machine-learned models on mobile devices with low latency, so you can use them for classification, regression, and other tasks.

What is ML Kit?

ML Kit is a cross-platform mobile SDK used to implement ML techniques in mobile applications. It brings Google's ML technologies, such as Mobile Vision and TensorFlow Lite, together in a single SDK, and offers both on-device and cloud-based APIs.

Three approaches we are going to use for Object Detection:

1. TensorFlow Object Detection:
The TensorFlow Object Detection API is a framework for creating deep learning networks that solve object detection problems. We will use a TensorFlow model in protobuf (.pb) format for this approach.

2. TensorFlow Lite Object Detection:
TensorFlow Lite is an open-source deep learning framework to run TensorFlow models on-device. We can detect objects from camera feeds with MobileNet models.

3. ML Kit Object Detection API:
With ML Kit's on-device Object Detection and Tracking API, you can detect and track objects in an image or live camera feed.

How to use the camera of an Android device?

First of all, we need to add the camera and internet permissions, along with a uses-feature entry for the camera, to the AndroidManifest.xml file.

For all three approaches we will use Android's CameraX library.

Use-case-based approach of CameraX:

1. Preview: Shows the real-time camera image on the display.

2. Image Analysis: Gives you frame-by-frame access to the image data, for example to compute the luminosity of each frame. You can pass these frames to ML Kit or to other machine learning and image analysis tools, such as object detection.

3. Image capture: Lets you capture an image in high quality and save it.
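To illustrate the kind of per-frame work the Image Analysis use case is meant for, here is a minimal sketch that averages the luminosity of one frame. It is plain Java and assumes you already have the frame's luma (Y) plane as a ByteBuffer, which is how a YUV camera frame usually exposes it. The class name Luminosity is hypothetical, and the sketch ignores any row padding that a real camera buffer may contain.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of CameraX.
final class Luminosity {

    // Averages the 8-bit luma samples in the given Y plane.
    // Assumption: every byte in the buffer is a pixel (no row padding).
    static double average(ByteBuffer yPlane) {
        // Work on a duplicate so the caller's position and limit stay untouched.
        ByteBuffer view = yPlane.duplicate();
        view.rewind();
        int count = view.remaining();
        if (count == 0) {
            return 0.0;
        }
        long sum = 0;
        while (view.hasRemaining()) {
            // Java bytes are signed, so mask to get a value in 0..255.
            sum += view.get() & 0xFF;
        }
        return (double) sum / count;
    }
}
```

An analyzer would call something like this once per frame and then release the frame, so that the camera can deliver the next one.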

To use CameraX in an Android application, you need to add the CameraX dependencies to build.gradle (Module: app).

Object Detection using TensorFlow:
Why use TensorFlow?

The main reason is that protobuf (.pb) models can only be run with the TensorFlow framework, not with TensorFlow Lite.

To use TensorFlow in an Android application, you need to add the TensorFlow dependency to build.gradle (Module: app).

Setting Up TensorFlow

There are three steps for setting up TensorFlow in our project:

1. Load your TensorFlow library in a separate class or in an activity where you want to perform object detection.

Before creating an instance of the TensorFlowInferenceInterface class we need to define some constants. They are as follows:

  • MODEL_FILE - Path of the .pb model, which must be stored in the assets folder
  • INPUT_NODE - Name of the input node in our model
  • OUTPUT_NODES - An array of output node names (it varies with the .pb model)
  • OUTPUT_NODE - Name of the output node in our model
  • INPUT_SIZE - Size of the model input

2. Create an instance of TensorFlowInferenceInterface, the class responsible for running inference, that is, making predictions.

3. Now we create a function that takes a bitmap as input and stores the result of the model in the output array.

Here, the bitmap is created from a frame extracted from the live camera feed or from a video.
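Before a bitmap can go into the model, its pixels usually have to be repacked into the layout the input node expects. The sketch below assumes the model takes 8-bit RGB values and that the pixels arrive as packed ARGB integers, the format Android's Bitmap.getPixels() produces. PixelConverter is a hypothetical helper name, not part of TensorFlow.

```java
// Hypothetical helper for preparing model input; plain Java, no Android types.
final class PixelConverter {

    // Converts packed ARGB pixels into a flat R,G,B byte array, dropping alpha.
    // Assumption: the model input is uint8 RGB in row-major order.
    static byte[] argbToRgbBytes(int[] argbPixels) {
        byte[] rgb = new byte[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; i++) {
            int pixel = argbPixels[i];
            rgb[i * 3] = (byte) ((pixel >> 16) & 0xFF);     // red
            rgb[i * 3 + 1] = (byte) ((pixel >> 8) & 0xFF);  // green
            rgb[i * 3 + 2] = (byte) (pixel & 0xFF);         // blue
        }
        return rgb;
    }
}
```

For an INPUT_SIZE x INPUT_SIZE bitmap, the resulting array would then be fed to the input node as a tensor of shape [1, INPUT_SIZE, INPUT_SIZE, 3]. Models that expect float input need a different conversion.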

This output array contains all the objects detected in a particular image, and we can use it to draw boxes over them. To draw, use the Canvas class from the android.graphics package, which draws a rectangle on the overlay layout at the given coordinates.
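The coordinates in the output array are often not pixels yet. Models exported with the TensorFlow Object Detection API typically report each box as four normalized values in the order [ymin, xmin, ymax, xmax]. Assuming that layout, a sketch of the conversion to pixel coordinates could look like this (BoxMapper is a hypothetical name):

```java
// Hypothetical helper; assumes boxes are normalized as {yMin, xMin, yMax, xMax}.
final class BoxMapper {

    // Returns {left, top, right, bottom} in pixels, clamped to the image bounds.
    static int[] toPixelRect(float[] box, int imageWidth, int imageHeight) {
        int left = clamp(Math.round(box[1] * imageWidth), 0, imageWidth);
        int top = clamp(Math.round(box[0] * imageHeight), 0, imageHeight);
        int right = clamp(Math.round(box[3] * imageWidth), 0, imageWidth);
        int bottom = clamp(Math.round(box[2] * imageHeight), 0, imageHeight);
        return new int[] {left, top, right, bottom};
    }

    private static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }
}
```

The four values map directly onto the left, top, right, and bottom arguments of a rectangle drawn with Canvas. Check the output layout of your own model before relying on this order.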

Result:
Fig1. Objects detected using TensorFlow

Object Detection using TensorFlow Lite:
Why use TensorFlow Lite?

Applications built on TensorFlow Lite have better performance and a smaller binary size than those built on TensorFlow Mobile. TensorFlow Lite models have faster inference times and require less processing power, which helps in real-time applications.

TensorFlow Lite can run on the CPU and also supports hardware acceleration through delegates such as the GPU delegate and NNAPI.

There are two ways we can detect objects using TensorFlow Lite:
  • TensorFlow Lite CPU
  • TensorFlow Lite GPU Delegate

To use TensorFlow Lite in an Android application, you need to add the TensorFlow Lite dependencies to build.gradle (Module: app).

TensorFlow Lite on CPU

1. First, we need to create an instance of ObjectDetector, the class responsible for detecting objects in images.

Here, the variable MODEL_FILE contains the .tflite model name.

2. Now we create a function to run object detection on a particular image. For that, we first convert the Bitmap image to a TensorImage and then call the detect() method of ObjectDetector.

Here, the results list contains all the detected objects from a particular image which can be used to draw boxes over the detected object using the Canvas class.
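One detail that is easy to miss when drawing: the boxes are expressed in the coordinates of the image that was analyzed, while the overlay usually has a different size. Here is a minimal sketch of the rescaling, assuming the preview is stretched to fill the overlay without cropping. OverlayScaler is a hypothetical name, and rectangles are plain {left, top, right, bottom} arrays to keep the example free of Android types.

```java
// Hypothetical helper; assumes the image fills the view without cropping.
final class OverlayScaler {

    // Maps a rectangle from image coordinates to view coordinates.
    static float[] scale(float[] imageRect, int imageWidth, int imageHeight,
                         int viewWidth, int viewHeight) {
        float scaleX = (float) viewWidth / imageWidth;
        float scaleY = (float) viewHeight / imageHeight;
        return new float[] {
            imageRect[0] * scaleX, // left
            imageRect[1] * scaleY, // top
            imageRect[2] * scaleX, // right
            imageRect[3] * scaleY  // bottom
        };
    }
}
```

If the preview is center-cropped or rotated instead, the mapping also needs an offset or a rotation step, which this sketch leaves out.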

Result:
Fig2. Objects detected using TensorFlow Lite CPU

TensorFlow Lite on GPU

There are many advantages of using GPU acceleration such as:

  • Speed - GPUs are designed to have high throughput for massively parallelizable workloads.
  • Accuracy - GPUs perform computation with 16-bit or 32-bit floating point numbers and (unlike the CPUs) do not require quantization for optimal performance.
  • Energy efficiency - A GPU carries out computations in a very efficient and optimized way, consuming less power and generating less heat than the same task run on a CPU.
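To see what the accuracy point means in practice, it helps to look at what quantization does to a single value. With 8-bit quantization a real number is stored as an integer q so that real ≈ scale × (q − zeroPoint), and whatever does not fit the grid is rounded away. The sketch below uses hypothetical values for scale and zeroPoint; real models carry their own values per tensor.

```java
// Hypothetical helper illustrating 8-bit affine quantization.
final class Quantizer {

    // Maps a real value to an unsigned 8-bit code, clamped to 0..255.
    static int quantize(float value, float scale, int zeroPoint) {
        int q = Math.round(value / scale) + zeroPoint;
        return Math.max(0, Math.min(255, q));
    }

    // Maps an 8-bit code back to the real value it represents.
    static float dequantize(int q, float scale, int zeroPoint) {
        return scale * (q - zeroPoint);
    }
}
```

With scale = 0.5 and zeroPoint = 128, the value 1.3 is stored as 131 and read back as 1.5. That rounding error is the price of the smaller, faster integer representation, and it is what a floating-point GPU path avoids.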

Now let's see how to use the GPU backend through the TensorFlow Lite delegate APIs on Android.

Steps for GPU implementation:

1. First, we need to create an instance of the Interpreter class, which runs inference on the model.

Here, the variable MODEL_FILE contains the .tflite model name.

The loadModelFile() method loads the model from the assets folder in Android.
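A loader like this usually memory-maps the model instead of copying it into the heap. On Android, the start offset and length of the model inside the APK typically come from the asset's file descriptor. The sketch below shows only the mapping step in plain Java, with a hypothetical ModelMapper class that takes a file path, an offset, and a length as stand-ins for those values.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; on Android the offset and length would come from the asset.
final class ModelMapper {

    // Maps a read-only region of the file into memory.
    static MappedByteBuffer mapRegion(Path file, long startOffset, long length)
            throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, startOffset, length);
        }
    }
}
```

The returned buffer is the kind of object the Interpreter constructor accepts as its model argument.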

2. Next, we create a function that returns an array of detected objects.

Our tflite model returns multiple outputs, so we need to use the runForMultipleInputsOutputs method with a map of arrays to store the results.

3. Draw the bounding boxes on the bitmap using the coordinates from the results. We can also display the object names, which are given as classes in the output array.
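Steps 2 and 3 can be sketched as follows. The example assumes a common SSD-style output layout of four tensors (boxes, class indices, scores, and a detection count) with at most 10 detections per image. The class SsdOutputs, the output order, and the maximum count are assumptions, so check them against your own model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for the model outputs; the layout is an assumption.
final class SsdOutputs {
    static final int MAX_DETECTIONS = 10;

    final float[][][] boxes = new float[1][MAX_DETECTIONS][4];
    final float[][] classes = new float[1][MAX_DETECTIONS];
    final float[][] scores = new float[1][MAX_DETECTIONS];
    final float[] count = new float[1];

    // Maps each output index to the array that should receive its values.
    Map<Integer, Object> asOutputMap() {
        Map<Integer, Object> outputs = new HashMap<>();
        outputs.put(0, boxes);
        outputs.put(1, classes);
        outputs.put(2, scores);
        outputs.put(3, count);
        return outputs;
    }

    // Builds captions such as "dog 90%" for detections at or above minScore.
    List<String> captions(String[] labels, float minScore) {
        List<String> result = new ArrayList<>();
        int detected = Math.min((int) count[0], MAX_DETECTIONS);
        for (int i = 0; i < detected; i++) {
            if (scores[0][i] < minScore) {
                continue;
            }
            int classIndex = (int) classes[0][i];
            boolean known = classIndex >= 0 && classIndex < labels.length;
            String name = known ? labels[classIndex] : "unknown";
            result.add(name + " " + Math.round(scores[0][i] * 100) + "%");
        }
        return result;
    }
}
```

The map returned by asOutputMap() is what would be passed as the second argument of runForMultipleInputsOutputs, with the input tensor wrapped in an Object array as the first. After the call the arrays hold the results, and captions() turns the class indices into the names shown next to each box.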

Result:
Fig3. Objects detected on video using TensorFlow Lite GPU
Fig4. Objects detected on live camera feed using TensorFlow Lite GPU

Object Detection using ML Kit:

ML Kit's Vision APIs offer widely used ML models, including object detection. You can use ML Kit's default model for object detection directly, or use your own custom tflite model, which is what we are going to implement here.

Add the ML Kit object detection dependency to build.gradle (Module: app).

Steps to configure ML Kit in Android:

1. Configure the object detector

Here, MODEL_FILE contains the custom model name stored in the assets folder.

2. Now, to detect objects, we need to pass an image to the ObjectDetector instance’s process() method.

If the process() method succeeds, a list of DetectedObject instances is passed to the success listener.

Here, each detected object contains the following properties:

  • Bounding box - A Rect that gives the position of the object in the image.
  • Tracking ID - An integer that identifies the object across images. Null in SINGLE_IMAGE_MODE.
  • Labels - A list of labels, each with a text description, an index, and a confidence value.

To sum up, these properties help to label every object and to draw bounding boxes over the detected objects.
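As a small illustration of how these properties combine into the text drawn next to a box, the sketch below picks the most confident label and appends the tracking ID when there is one. LabelInfo and CaptionBuilder are hypothetical plain-Java stand-ins for the values read from a detected object, not ML Kit classes.

```java
import java.util.List;

// Hypothetical stand-in for one label of a detected object.
final class LabelInfo {
    final String text;
    final int index;
    final float confidence;

    LabelInfo(String text, int index, float confidence) {
        this.text = text;
        this.index = index;
        this.confidence = confidence;
    }
}

// Hypothetical helper that builds the caption drawn next to a bounding box.
final class CaptionBuilder {

    // Returns the label with the highest confidence, or null if there is none.
    static LabelInfo best(List<LabelInfo> labels) {
        LabelInfo top = null;
        for (LabelInfo label : labels) {
            if (top == null || label.confidence > top.confidence) {
                top = label;
            }
        }
        return top;
    }

    // trackingId may be null, for example when single-image mode is used.
    static String caption(Integer trackingId, List<LabelInfo> labels) {
        LabelInfo top = best(labels);
        String name = (top == null) ? "object" : top.text;
        return (trackingId == null) ? name : name + " #" + trackingId;
    }
}
```

Handling the null tracking ID explicitly keeps the same drawing code usable for both still images and the live camera feed.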

Result:
Fig5. Objects detected on live camera feed using ML Kit Vision API

And that’s about it! These are the ways we can implement Object Detection on Android mobile devices using three different ML libraries: TensorFlow, TensorFlow Lite, and ML Kit.