The source code to this project is now available in full, at: http://imrannazar.com/content/files/android-sobel.zip
In the previous part of this set of articles, I began an introduction to augmented reality, using the simple example of edge detection on Android smartphones; in that part, the camera hardware was introduced, and the framework of an application developed for the use of the camera preview. In this concluding part, the edge detection algorithm itself and its implementation will be explored.
The Sobel operator
The algorithm that will be used is the Sobel operator, which works as a filter applied to each pixel in an image. The process iterates over each pixel in a row of the image, and over each row in turn, performing a factorised multiplication for each pixel value:
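In the standard formulation, the operator convolves the source image A with a pair of 3x3 kernels, one per axis, and combines the two resulting gradients by Pythagorean addition:

```latex
G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * A
\qquad
G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * A
\qquad
G = \sqrt{G_x^2 + G_y^2}
```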
The calculation of the Sobel operator value can be simplified in two ways:
- Removal of multiplication: Some of the coefficients used by the algorithm are zero, which means the associated terms drop out of the calculation entirely; others are negative, which means the corresponding term is subtracted rather than added. Replacing the multiplications with addition and subtraction of terms reduces the number of operations required to produce each value, making the calculation quicker.
- Approximation of Pythagorean addition: For the purposes of this application, an exact value for the resultant Sobel value is not required, merely an approximation; a relatively close approximation of the Pythagorean operator is a simple average of the absolute values of the two gradients. This average will never exceed the actual value, but will serve as a fair replacement.
With these modifications, the calculation can be adapted to the following.
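As a sketch of the resulting arithmetic for a single pixel, assuming the nine neighbouring luminance values arrive in row-major order as p0 through p8 (the class, method, and parameter names here are illustrative, not taken from the article's listing):

```java
public class SobelApprox {
    // Simplified Sobel value for one pixel, given its 3x3 neighbourhood
    // p0 p1 p2
    // p3 p4 p5   (p4, the centre pixel, does not enter the calculation)
    // p6 p7 p8
    public static int at(int p0, int p1, int p2,
                         int p3, int p4, int p5,
                         int p6, int p7, int p8) {
        // Horizontal gradient: the factor-of-2 terms are added twice
        // rather than multiplied
        int gx = (p2 - p0) + (p5 - p3) + (p5 - p3) + (p8 - p6);
        // Vertical gradient, by the same scheme
        int gy = (p6 - p0) + (p7 - p1) + (p7 - p1) + (p8 - p2);
        // Pythagorean addition approximated by a simple average
        return (Math.abs(gx) + Math.abs(gy)) / 2;
    }
}
```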
Before this filter can be applied to the camera preview image, the image must be taken from the camera and made ready for processing.
Handling the Camera Preview
As introduced in Part 1, the camera hardware is capable of automatically calling a predefined function whenever a frame of the preview is ready; this function is referred to as the "preview callback", and receives a byte array containing the raw image data. By default, the preview image is in NV21 format, a standard luminance/chrominance format; for the example of a 320x240-pixel NV21 image:
- The first 76,800 bytes of the image are a direct luminance map, with each byte corresponding to a "brightness" or greyscale value for the corresponding pixel in the image;
- The following 38,400 bytes are a 2x2 subsampling of chrominance: for each 2x2-pixel block in the image, one byte encodes the V-chrominance value, and the following byte the U-value.
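These sizes follow directly from the frame dimensions; a small sketch of the arithmetic (the class and method names are illustrative):

```java
public class Nv21Layout {
    // Size in bytes of the luminance (Y) plane: one byte per pixel
    public static int lumaSize(int width, int height) {
        return width * height;
    }

    // Size of the interleaved chrominance plane: one V and one U byte
    // per 2x2 block of pixels, i.e. half the size of the luma plane
    public static int chromaSize(int width, int height) {
        return width * height / 2;
    }

    // Index into the preview byte array of the luminance value at (x, y)
    public static int lumaIndex(int width, int x, int y) {
        return y * width + x;
    }
}
```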
It's relatively straightforward to perform a Sobel calculation on the luminance part of the NV21 image, and a thresholded result can be placed into the overlay canvas for each output pixel:
src/sobel/OverlayView.java: Sobel operation
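A sketch of how this operation might be written, assuming the preview arrives in a byte array in NV21 format, and the overlay is backed by an int array of ARGB pixel values; the threshold value of 48 is an arbitrary choice for illustration, not taken from the article's listing:

```java
public class SobelPass {
    // Runs the simplified Sobel operator over the luminance plane of an
    // NV21 frame, writing a thresholded ARGB value for each output pixel
    public static void apply(byte[] data, int width, int height, int[] overlay) {
        for (int y = 1; y < height - 1; y++) {
            for (int x = 1; x < width - 1; x++) {
                int i = y * width + x;
                // Luminance bytes are unsigned; mask off the sign extension
                int p0 = data[i - width - 1] & 0xFF, p1 = data[i - width] & 0xFF,
                    p2 = data[i - width + 1] & 0xFF, p3 = data[i - 1] & 0xFF,
                    p5 = data[i + 1] & 0xFF,         p6 = data[i + width - 1] & 0xFF,
                    p7 = data[i + width] & 0xFF,     p8 = data[i + width + 1] & 0xFF;
                // Gradients with multiplications replaced by repeated addition
                int gx = (p2 - p0) + (p5 - p3) + (p5 - p3) + (p8 - p6);
                int gy = (p6 - p0) + (p7 - p1) + (p7 - p1) + (p8 - p2);
                int sobel = (Math.abs(gx) + Math.abs(gy)) / 2;
                // Thresholded result: opaque white on edges, transparent elsewhere
                overlay[i] = (sobel > 48) ? 0xFFFFFFFF : 0x00000000;
            }
        }
    }
}
```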
The above code, when run as part of the camera preview, yields the following view.
Optimising the operation
As written, there's a problem with this application: speed. When run on a hardware device, the overlay calculation is incapable of maintaining a near-real-time speed of augmented display; in the case of my own hardware, a rendering speed of around 3 frames per second was achieved. This is due, in the main, to the calculations being performed within a buffer of managed memory in the Dalvik virtual machine: every access to the camera preview data is checked for boundary conditions, as is every pixel value written to the overlay canvas. All of these checks for boundary conditions take time away from the Sobel operation.
To alleviate this issue, the calculation can be performed in native code, bypassing the virtual machine; this is done through the Android Native Development Kit (NDK). The NDK is an implementation of the Java Native Interface (JNI), and as such behaves in a very similar way to standard JNI: native code is placed into functions conforming to a particular naming standard, which can then be called from the Java VM through methods marked as native.
NDK native functions are named according to the package and class they're destined for: the standard format is Java_<package>_<class>_<function>. In this particular case, the destination is package sobel and class OverlayView, so the interface can be built as below.
jni/native.c: NDK processing interface
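A sketch of such an entry point, which requires the NDK's jni.h to build; the method name nativeSobel and the exact parameter list are assumptions, chosen to match the description above:

```c
/* JNI entry point for package sobel, class OverlayView */
#include <jni.h>

JNIEXPORT void JNICALL
Java_sobel_OverlayView_nativeSobel(JNIEnv *env, jobject obj,
                                   jbyteArray preview,
                                   jint width, jint height,
                                   jobject out)
{
    /* Address of the memory backing the direct IntBuffer */
    jint *output = (jint *)(*env)->GetDirectBufferAddress(env, out);

    /* Pinned (or copied) view of the preview bytes */
    jbyte *input = (*env)->GetByteArrayElements(env, preview, NULL);

    /* ... Sobel pass over input, writing ARGB words to output ... */

    /* JNI_ABORT: the preview bytes were only read, never modified */
    (*env)->ReleaseByteArrayElements(env, preview, input, JNI_ABORT);
}
```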
src/sobel/OverlayView.java: Native function definition
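A sketch of the Java-side declaration, assuming the NDK library is built under the name "native" and the method is called nativeSobel; both names are assumptions, and would have to match the NDK build configuration and the C entry point in jni/native.c:

```java
package sobel;

import java.nio.IntBuffer;

public class OverlayView /* extends View in the full source */ {
    // Load libnative.so, the shared library produced by the NDK build
    static {
        System.loadLibrary("native");
    }

    // Implemented in jni/native.c: reads the NV21 preview bytes and
    // writes thresholded ARGB pixels into the direct IntBuffer
    private native void nativeSobel(byte[] preview, int width, int height,
                                    IntBuffer out);
}
```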
Note that in the above code, the int array used beforehand for overlay output has been replaced by an IntBuffer; this is to allow access to the raw memory from native code, since a standard int array is managed by the JVM and cannot be written to directly from JNI without copying. Direct buffers, by contrast, expose their backing memory to native code through the JNI GetDirectBufferAddress function, which we can use for writing the output of the Sobel operation.
The Java code shown above for the operation can be translated directly to C code, as below:
jni/native.c: Sobel implementation
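A sketch of that translation, with the per-pixel arithmetic written as a plain C helper that the JNI entry point can call once it has obtained the input and output pointers; the function name and the threshold value of 48 are assumptions:

```c
#include <stdlib.h>

/* Runs the simplified Sobel operator over the luminance plane of an
 * NV21 frame, writing a thresholded ARGB word for each output pixel */
void sobel_frame(const unsigned char *in, int width, int height,
                 unsigned int *out)
{
    int x, y;
    for (y = 1; y < height - 1; y++) {
        for (x = 1; x < width - 1; x++) {
            const unsigned char *p = in + y * width + x;
            /* Gradients with multiplications replaced by repeated
             * addition and subtraction of terms */
            int gx = (p[-width + 1] - p[-width - 1])
                   + (p[1] - p[-1]) + (p[1] - p[-1])
                   + (p[width + 1] - p[width - 1]);
            int gy = (p[width - 1] - p[-width - 1])
                   + (p[width] - p[-width]) + (p[width] - p[-width])
                   + (p[width + 1] - p[-width + 1]);
            /* Pythagorean addition approximated by a simple average */
            int sobel = (abs(gx) + abs(gy)) / 2;
            /* Thresholded result: opaque white edge or transparent */
            out[y * width + x] = (sobel > 48) ? 0xFFFFFFFFu : 0u;
        }
    }
}
```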
src/sobel/OverlayView.java: Calling the native function
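The calling side is then a matter of allocating a direct buffer and invoking the method from the preview callback; a sketch with illustrative names follows. Note that a direct IntBuffer must be obtained through ByteBuffer.allocateDirect, since IntBuffer.allocate produces a heap buffer with no address visible to JNI:

```java
// Inside OverlayView: four bytes per output pixel, in native byte order
private final IntBuffer overlay =
        ByteBuffer.allocateDirect(width * height * 4)
                  .order(ByteOrder.nativeOrder())
                  .asIntBuffer();

public void onPreviewFrame(byte[] data, Camera camera) {
    nativeSobel(data, width, height, overlay);  // fill the overlay buffer
    invalidate();                               // schedule a redraw
}
```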
Once the Java code has been configured to call the native function for processing, the lack of extraneous work by the JVM results in a significant speed-up: under testing on my hardware, a speed of 15-20 frames per second was easily achievable, and this can be improved through further optimisation of the algorithm.
The Android documentation for the NDK states:
"Using native code does not result in an automatic performance increase, but always increases application complexity."
In the case of the memory-intensive processing presented here, the NDK has a significant advantage over the Java virtual machine, in that it doesn't perform bounds checking on array and pointer accesses. Since most augmented reality applications will need to work on the camera preview image, and provide an overlay on top of the preview, the technique of shunting processing into an NDK function can be useful.
Imran Nazar <email@example.com>, May 2011.
Article dated: 21st May 2011