Tag Archives: augmented reality

Sensor-based Augmented Reality made simple

I did some content-based augmented reality for Android and my former student developed a sensor-based augmented reality app, so I figured I should be able to do the sensor-based stuff as well. I fiddled around a lot trying to make it work with the Canvas, but finally I realized that I'm just not able to do it that way and switched to OpenGL. I attached an Eclipse project with the source code.

Even though I couldn't find a good example or tutorial, it was pretty easy and definitely much easier than going the Canvas way. Basically you have to use the SensorManager to register for accelerometer and magnetometer (that's the compass) events. You find the code in the class PhoneOrientation. The accelerometer and compass data can be combined into a rotation matrix using the code below. I also had to "remap the coordinate system" because the example uses portrait mode.

SensorManager.getRotationMatrix(newMat, null, acceleration, orientation);
SensorManager.remapCoordinateSystem(newMat,
		SensorManager.AXIS_Y, SensorManager.AXIS_MINUS_X,
		newMat);
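
For completeness, registering the two sensors and feeding their values into getRotationMatrix looks roughly like the sketch below. It is a minimal example with my own naming (the listener and the processing hook are placeholders) – PhoneOrientation in the project does the real work and may be organised differently:

// Minimal sketch (inside an Activity); the project's PhoneOrientation class does the real work.
SensorManager sensorManager = (SensorManager) getSystemService(Context.SENSOR_SERVICE);

SensorEventListener listener = new SensorEventListener() {
    private float[] acceleration = new float[3];  // last accelerometer values
    private float[] orientation = new float[3];   // last magnetometer values
    private final float[] newMat = new float[16]; // 4x4 rotation matrix for OpenGL

    public void onSensorChanged(SensorEvent event) {
        if (event.sensor.getType() == Sensor.TYPE_ACCELEROMETER) {
            acceleration = event.values.clone();
        } else if (event.sensor.getType() == Sensor.TYPE_MAGNETIC_FIELD) {
            orientation = event.values.clone();
        }
        // combine both sensors into one rotation matrix and remap it for the screen orientation
        if (SensorManager.getRotationMatrix(newMat, null, acceleration, orientation)) {
            SensorManager.remapCoordinateSystem(newMat,
                    SensorManager.AXIS_Y, SensorManager.AXIS_MINUS_X, newMat);
            // hand newMat over to the renderer here
        }
    }

    public void onAccuracyChanged(Sensor sensor, int accuracy) { }
};

sensorManager.registerListener(listener,
        sensorManager.getDefaultSensor(Sensor.TYPE_ACCELEROMETER),
        SensorManager.SENSOR_DELAY_GAME);
sensorManager.registerListener(listener,
        sensorManager.getDefaultSensor(Sensor.TYPE_MAGNETIC_FIELD),
        SensorManager.SENSOR_DELAY_GAME);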

The newMat is a 4×4 matrix stored as a float array (getRotationMatrix fills a 16-element array as a 4×4 matrix). This matrix must be passed to the OpenGL rendering pipeline and loaded as the model-view matrix (floatMat below is that same matrix):

gl.glMatrixMode(GL10.GL_MODELVIEW);
gl.glLoadMatrixf(floatMat, 0);
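
In context this happens in the renderer's onDrawFrame. The sketch below is my rough guess at how it fits into a GLSurfaceView.Renderer – class and field names are placeholders, not taken from the attached project:

// Rough sketch of where the matrix is used (names are placeholders, not from the project).
class ARRenderer implements GLSurfaceView.Renderer {
    // updated from the sensor listener, read on the GL thread
    volatile float[] floatMat = new float[16];

    public void onDrawFrame(GL10 gl) {
        gl.glClear(GL10.GL_COLOR_BUFFER_BIT | GL10.GL_DEPTH_BUFFER_BIT);
        gl.glMatrixMode(GL10.GL_MODELVIEW);
        gl.glLoadMatrixf(floatMat, 0); // orient the scene according to the phone's pose
        // ... draw the augmented content here ...
    }

    public void onSurfaceChanged(GL10 gl, int width, int height) {
        gl.glViewport(0, 0, width, height);
        // a perspective projection would be set up here (e.g. via GLU.gluPerspective)
    }

    public void onSurfaceCreated(GL10 gl, EGLConfig config) { }
}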

That's basically it. As I never learned how to use OpenGL, in particular how to load textures, the project is based on an earlier example that renders the camera image on a cube. The project also uses reflection to access an Android 2.2 API for fast access to the camera images (that's why it still works on Android 2.1). Check out the Eclipse project if you are interested or install the demo on your Android 2.1 device (on cyrket/in the market).
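
In case you wonder what calling a newer API via reflection looks like, it is something along these lines. The concrete method here, addCallbackBuffer, is just an example of a camera method that is public in Android 2.2 but hidden on older versions – not necessarily the one the project uses:

// Illustration of the reflection trick only; the actual method the project calls may differ.
private static void addCallbackBufferViaReflection(Camera camera, byte[] buffer) {
    try {
        Method m = Camera.class.getMethod("addCallbackBuffer", byte[].class);
        m.invoke(camera, buffer);
    } catch (Exception e) {
        // method not available on this device/firmware - fall back to the normal callback
        Log.w("CameraReflection", "addCallbackBuffer not available", e);
    }
}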

Being the off-screen king

Recently Torben and I spammed the "International Conference on Human-Computer Interaction with Mobile Devices and Services" (better known as MobileHCI) with two papers and a poster about off-screen visualizations. Off-screen visualizations try to reduce the impact of the inherent size restrictions of mobile devices' displays. The idea is that the display is just a window into a larger space, and off-screen visualizations show where the user should look for objects located in this larger space.

The title of the first paper is Visualization of Off-Screen Objects in Mobile Augmented Reality. It deals with displaying points of interest using sensor-based mobile augmented reality. We compare the common mini-map, which provides a 2D overview of nearby objects, with the less common visualization of nearby objects using arrows that point at them. The images below show both visualizations side by side.

off-screen visualizations for handheld augmented reality

To compare the mini-map with the arrows we conducted a small user study in the city centre. We randomly asked passersby to participate in our study (big thanks to my student Manuel who attracted 90% of our female participants). We ended up with 26 people testing both visualizations. Probably because most participants were not tech-savvy, the collected data is heavily affected by noise. From the results (see the paper for more details) we still conclude that our arrows outperform the mini-map. Even though the study has some flaws I'm quite sure that our results are valid. However, we only tested a very small number of objects and I'm pretty sure that one would get different results for a larger number of objects. I would really like to see a study that analyzes a larger number of objects and additional visualizations.

In the paper Evaluation of an Off-Screen Visualization for Magic Lens and Dynamic Peephole Interfaces I compared a dynamic peephole interface with a Magic Lens, each used with an arrow-based off-screen visualization or with no off-screen visualization. The idea of dynamic peephole interfaces is that the mobile phone's display is a window to a virtual surface (e.g. a digital map) that you explore by physically moving the phone around. The Magic Lens is very similar, with the important difference that you explore a physical surface (e.g. a paper map) that is augmented with additional information. The concept of the Magic Lens is sketched in the figure below.

Conceptual sketch of using a Magic Lens to interact with a paper map.

We could not measure a difference between the Magic Lens and the dynamic peephole interface. However, we did measure a clear difference between using an off-screen visualization and not using one. I assume that these off-screen visualizations have a much larger impact on the user experience than the choice between a Magic Lens and a dynamic peephole. As the Magic Lens relies on a physical surface, I doubt that it has a relevant benefit (for the simple tasks we tested, of course).

As some guys asked me why I use arrows and not those fancy Halos or Wedges (actually I wonder if anyone has ever fully implemented Wedge for an interactive application) I thought it might be nice to be able to cite my own paper. Thus, I decided to compare some off-screen visualization techniques for digital maps (e.g. Google Maps) on mobile phones. As it would've been a bit boring to just repeat the same study conducted by Burigat and co, I decided to let users interact with the map (instead of using a static prototype). To make it a bit more interesting (and because I'm lazy) we developed a prototype and published it to the Android Market. We collected some data from users who installed the app and completed an interactive tutorial. The results indicate that arrows are just better than Halos. However, our methodology is flawed and I assume that we haven't measured what we intended to measure. You can test the application on your Android phone or just have a look at the poster.

Screenshots of our application in the Android Market

I’m a bit afraid that the papers will end up in the same session. Might be annoying for the audience to see two presentations with the same motivation and similar related work.

Goodbye garbage collector – patching Android to make real-time camera image processing feasible

If you want to process camera images on Android phones for real-time object recognition or content-based Augmented Reality you have probably heard about the Camera Preview Callback memory Issue. Each time your Java application gets a preview image from the system, a new chunk of memory is allocated. When this memory chunk is freed again by the Garbage Collector the system freezes for 100ms-200ms. This is especially bad if the system is under heavy load (I do object recognition on a phone – hooray, it eats as much CPU power as it can get). If you browse through the Android 1.6 source code you realize that this happens only because the wrapper (that protects us from the native stuff) allocates a new byte array each time a new frame is available. Built-in native code can, of course, avoid this issue.
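
On the application side the problem looks like this (a minimal sketch, not code from one of my projects): every call to onPreviewFrame hands you a byte array that the wrapper has just allocated for exactly this frame.

// Minimal sketch of the app side: each call to onPreviewFrame receives a byte[]
// that the framework freshly allocated for this frame (and the GC has to clean up later).
camera.setPreviewCallback(new Camera.PreviewCallback() {
    public void onPreviewFrame(byte[] data, Camera camera) {
        // 'data' is a brand-new buffer, roughly 230 KB for a 480x320 YUV frame
        processFrame(data); // placeholder for the actual object recognition
    }
});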

I still hope someone will fix the Camera Preview Callback memory Issue, but meanwhile I fixed it myself, at least for my phone and for building prototypes, by patching Donut's (Android 1.6) source code. What you find below is just an ugly hack I did for myself! To reproduce it you should know how to compile Android from source.

Avoid memory allocation

Diving into the source code starts with the Java wrapper of the Camera and its native counterpart android_hardware_Camera.cpp. A Java application calls setPreviewCallback, which calls the native function android_hardware_Camera_setHasPreviewCallback, and the call is passed further down into the system. When the driver delivers a new frame back up to the native wrapper, it ends up in the function JNICameraContext::copyAndPost():

void JNICameraContext::copyAndPost(JNIEnv* env, const sp<IMemory>& dataPtr, int msgType)
{
    jbyteArray obj = NULL;

    // allocate Java byte array and copy data
    if (dataPtr != NULL) {
        ssize_t offset;
        size_t size;
        sp<IMemoryHeap> heap = dataPtr->getMemory(&offset, &size);
        LOGV("postData: off=%d, size=%d", offset, size);
        uint8_t *heapBase = (uint8_t*)heap->base();

        if (heapBase != NULL) {
            const jbyte* data = reinterpret_cast<const jbyte*>(heapBase + offset);
            obj = env->NewByteArray(size);
            if (obj == NULL) {
                LOGE("Couldn't allocate byte array for JPEG data");
                env->ExceptionClear();
            } else {
                env->SetByteArrayRegion(obj, 0, size, data);
            }
        } else {
            LOGE("image heap is NULL");
        }
    }

    // post image data to Java
    env->CallStaticVoidMethod(mCameraJClass, fields.post_event,
            mCameraJObjectWeak, msgType, 0, 0, obj);
    if (obj) {
        env->DeleteLocalRef(obj);
    }
}

The culprit is the line obj = env->NewByteArray(size); which allocates a new Java byte array each time. For a frame with 480×320 pixels that means about 230 KB per call, and that takes some time. Even worse, this buffer must be freed again later by the Garbage Collector, which takes even more time. Thus, the task is to avoid these allocations. I don't care about compatibility with existing applications and want to keep the changes minimal. What I did is just a dirty hack, but it works quite well for me.

My approach is to allocate a Java byte array once and reuse it for every frame. First I added the following three variables to android_hardware_Camera.cpp:

static Mutex sPostDataLock; // A mutex that synchronizes calls to sCameraPreviewArrayGlobal
static jbyteArray sCameraPreviewArrayGlobal; // Buffer that is reused
static size_t sCameraPreviewArraySize=0; // Size of the buffer (or 0 if the buffer is not yet used)

To actually use the buffer I change the function copyAndPost by replacing it with the following code:

void JNICameraContext::copyAndPost(JNIEnv* env, const sp<IMemory>& dataPtr, int msgType) {
    if (dataPtr != NULL) {
        ssize_t offset;
        size_t size;
        sp<IMemoryHeap> heap = dataPtr->getMemory(&offset, &size);
        LOGV("postData: off=%d, size=%d", offset, size);
        uint8_t *heapBase = (uint8_t*)heap->base();

        if (heapBase != NULL) {
            const jbyte* data = reinterpret_cast<const jbyte*>(heapBase + offset);
            //HACK
            if ((sCameraPreviewArraySize==0) || (sCameraPreviewArraySize!=size)) {
                if (sCameraPreviewArraySize!=0) env->DeleteGlobalRef(sCameraPreviewArrayGlobal);
                sCameraPreviewArraySize=size;
                jbyteArray mCameraPreviewArray = env->NewByteArray(size);
                sCameraPreviewArrayGlobal=(jbyteArray)env->NewGlobalRef(mCameraPreviewArray);
                env->DeleteLocalRef(mCameraPreviewArray);
            }
            if (sCameraPreviewArrayGlobal == NULL) {
                LOGE("Couldn't allocate byte array for JPEG data");
                env->ExceptionClear();
            } else {
                env->SetByteArrayRegion(sCameraPreviewArrayGlobal, 0, size, data);
            }
        } else {
            LOGE("image heap is NULL");
        }
    }
    // post image data to Java
    env->CallStaticVoidMethod(mCameraJClass, fields.post_event, mCameraJObjectWeak, msgType, 0, 0, sCameraPreviewArrayGlobal);
}

If the buffer has the wrong size, a new buffer is allocated. Otherwise the buffer is just reused. This hack definitely has some nasty side effects in common situations. However, to be nice we should delete the global reference to our buffer when the camera is released. Therefore, I add the following code to the end of android_hardware_Camera_release:

if (sCameraPreviewArraySize!=0) {
    Mutex::Autolock _l(sPostDataLock);
    env->DeleteGlobalRef(sCameraPreviewArrayGlobal);
    sCameraPreviewArraySize=0;
}

Finally, I have to change the mutex used in the function postData. The Java patch below avoids passing the camera image to another thread, so the thread that calls postData is the same thread that calls my Java code. To be able to call camera functions from that Java code I need another mutex for postData. Usually the mutex mLock is locked via the line Mutex::Autolock _l(mLock); and I replace this line with Mutex::Autolock _l(sPostDataLock);.

Outsmart Android’s message queue

Unfortunately this is only the first half of our customization. Somewhere deep inside the system, probably at the driver level (been there once – don't want to go there again), is a thread which pumps the camera images into the system. This call ends up in the Java code of Camera.java, where the frame is delivered to the postEventFromNative method. However, afterwards the frame is not delivered directly to our application but takes a detour via Android's message queue. This is pretty ugly if we reuse our frame buffer: the detour makes the process asynchronous, and since the buffer is permanently overwritten this leads to corrupted frames. To avoid this detour, the delivery must be changed. The easiest solution (for me) is to take the code snippet that handles this callback from the method handleMessage:

            case CAMERA_MSG_PREVIEW_FRAME:
                if (mPreviewCallback != null) {
                    mPreviewCallback.onPreviewFrame((byte[])msg.obj, mCamera);
                    if (mOneShot) {
                        mPreviewCallback = null;
                    }
                }
                return;

and move it to the method postEventFromNative.

	if (what==CAMERA_MSG_PREVIEW_FRAME) {
                if (c.mPreviewCallback != null) {
                    c.mPreviewCallback.onPreviewFrame((byte[])obj, c);
                    if (c.mOneShot) {
                        c.mPreviewCallback = null;
                    }
                }
                return;
	}

This might have some nasty side effects in some not-so-specific situations. If you have done all that you might want to join the discussion about Issue 2794 and the thread "propose an API change in the Camera API: Excessive GC caused by preview callbacks" to find a proper solution for the Camera Preview Callback memory Issue (and leave a comment here if you have a better solution).
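
One practical consequence of the patch: since there is now a single buffer that is reused and delivered synchronously, your onPreviewFrame has to finish with the data – or copy it – before it returns, otherwise the next frame will overwrite it while you are still working. A minimal sketch (previewCopy is a hypothetical app-owned buffer, not something from the patch):

// Application-side callback that copies the reused buffer before returning.
camera.setPreviewCallback(new Camera.PreviewCallback() {
    private byte[] previewCopy;

    public void onPreviewFrame(byte[] data, Camera camera) {
        if (previewCopy == null || previewCopy.length != data.length) {
            previewCopy = new byte[data.length]; // allocated once, reused afterwards
        }
        // copy the shared buffer; after returning, the framework may overwrite 'data'
        System.arraycopy(data, 0, previewCopy, 0, data.length);
        // process 'previewCopy' (e.g. on a worker thread) without racing the camera
    }
});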

Camera image->NDK->OpenGL texture

Since we are currently working on some augmented reality stuff for Android I need to show the camera image using OpenGL ES. It works great in pure Java if one uses only the grayscale image. However, I needed the color image. The G1's camera delivers the image in a YUV format while OpenGL only understands RGB images. Unfortunately, converting the YUV image to RGB in pure Java is out of the question for images with 480×320 pixels. Thus, I used the NDK to implement the conversion. The code below does the job. It is based on code provided by Tom Gibara.

void toRGB565(unsigned short *yuvs, int widthIn, int heightIn, unsigned int *rgbs, int widthOut, int heightOut) {
  int half_widthIn = widthIn >> 1;

  //the end of the luminance data
  int lumEnd = (widthIn * heightIn) >> 1;
  //points to the next luminance value pair
  int lumPtr = 0;
  //points to the next chrominance value pair
  int chrPtr = lumEnd;
  //the end of the current luminance scanline
  int lineEnd = half_widthIn;

  int x,y;
  for (y=0;y<heightIn;y++) {
    int yPosOut = (y * widthOut) >> 1;
    for (x=0;x<half_widthIn;x++) {
      int Y1 = yuvs[lumPtr++];
      int Y2 = (Y1 >> 8) & 0xff;
      Y1 = Y1 & 0xff;
      int Cr = yuvs[chrPtr++];
      int Cb = ((Cr >> 8) & 0xff) - 128;
      Cr = (Cr & 0xff) - 128;

      int R, G, B;
      //generate first RGB components
      B = Y1 + ((454 * Cb) >> 8);
      if (B < 0) B = 0; if (B > 255) B = 255;
      G = Y1 - ((88 * Cb + 183 * Cr) >> 8);
      if (G < 0) G = 0; if (G > 255) G = 255;
      R = Y1 + ((359 * Cr) >> 8);
      if (R < 0) R = 0; if (R > 255) R = 255;
      int val = ((R & 0xf8) << 8) | ((G & 0xfc) << 3) | (B >> 3);

      //generate second RGB components
      B = Y2 + ((454 * Cb) >> 8);
      if (B < 0) B = 0; if (B > 255) B = 255;
      G = Y2 - ((88 * Cb + 183 * Cr) >> 8);
      if (G < 0) G = 0; if (G > 255) G = 255;
      R = Y2 + ((359 * Cr) >> 8);
      if (R < 0) R = 0; if (R > 255) R = 255;
      rgbs[yPosOut+x] = val | ((((R & 0xf8) << 8) | ((G & 0xfc) << 3) | (B >> 3)) << 16);
    }
    //skip back to the start of the chrominance values when necessary
    chrPtr = lumEnd + ((lumPtr  >> 1) / half_widthIn) * half_widthIn;
    lineEnd += half_widthIn;
  }
}

The code is not that optimized at the moment but can process a 480×320 image in ~25ms on my G1 (which is somewhat slow according to my student’s comments). In order to call this function from Java I needed a wrapper with a JNI signature:

/**
 * Converts the input image from YUV to a RGB 5_6_5 image.
 * The size of the output buffer must be at least the size of the input image.
 */
JNIEXPORT void JNICALL Java_de_offis_magic_core_NativeWrapper_image2TextureColor
  (JNIEnv *env, jclass clazz,
  jbyteArray imageIn, jint widthIn, jint heightIn,
  jobject imageOut, jint widthOut, jint heightOut,
  jint filter) {

	jbyte *cImageIn = (*env)->GetByteArrayElements(env, imageIn, NULL);
	jbyte *cImageOut = (jbyte*)(*env)->GetDirectBufferAddress(env, imageOut);


	toRGB565((unsigned short*)cImageIn, widthIn, heightIn, (unsigned int*)cImageOut, widthOut, heightOut);

	(*env)->ReleaseByteArrayElements(env, imageIn, cImageIn, JNI_ABORT);
}
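
On the Java side the matching native declaration and the texture upload look roughly like the sketch below. The package, class and method names follow the JNI signature above; everything else (library name, buffer setup, texture parameters) is an assumption about how the RGB565 buffer ends up in an OpenGL texture and not code taken from the project:

// Matching Java-side declaration (package/class derived from the JNI function name above).
package de.offis.magic.core;

public class NativeWrapper {
    static { System.loadLibrary("magic"); } // library name is an assumption

    public static native void image2TextureColor(
            byte[] imageIn, int widthIn, int heightIn,
            java.nio.Buffer imageOut, int widthOut, int heightOut,
            int filter);
}

Assuming the texture itself was created elsewhere (e.g. a 512×512 RGB565 texture holding the 480×320 image), the conversion plus upload could then look like:

// Convert the YUV preview frame into a direct RGB565 buffer and upload it into the texture.
ByteBuffer rgb565 = ByteBuffer.allocateDirect(512 * 512 * 2)
        .order(ByteOrder.nativeOrder());

NativeWrapper.image2TextureColor(yuvFrame, 480, 320, rgb565, 512, 512, 0);

gl.glBindTexture(GL10.GL_TEXTURE_2D, textureId);
gl.glTexSubImage2D(GL10.GL_TEXTURE_2D, 0, 0, 0, 512, 320,
        GL10.GL_RGB, GL10.GL_UNSIGNED_SHORT_5_6_5, rgb565);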

To make it more interesting I added some filters to the camera image. There is a demo app in the market (direct link to the market). I tried to make the whole thing portable but would love to know if it works on other devices like the Motorola Milestone.
Sepia, Black & White, Fisheye and Invert effects

Push the study to the market

My student Torben has just published his Android augmented reality app SINLA in the Android Market. Our aim is not only to publish a cool app but also to use the market for a user study. The application is similar to Layar and Wikitude, but we believe that the small mini-map found in existing applications (the small map you see in the lower right corner in the image below) might not be the best way to show users objects that are currently not in the camera's view.

We developed a different visualization for what we call "off-screen objects" that is inspired by off-screen visualizations for digital maps and by navigation in virtual reality. It is based on arrows pointing towards the objects. The arrows are arranged on a circle drawn in 3D perspective. Check out the image below to get an impression of how it looks.

It's our first attempt to use a mobile market to get feedback from real end users. We compare our visualization technique with the more traditional mini-map. We collect only very little information from users at the moment because we're afraid that we might deter users from providing any feedback at all. However, I'm thrilled to see if we can draw any conclusions from the feedback we get through the application. I assume that this is a new way to do evaluations which will become more important in the future.