Abstract
Android OS is filled with undocumented behavior and quirks that you learn the hard way: by discovery and by experience. Once you start playing around with the NDK, OpenGL and Android lifecycle events in multiple threads, things can get really hairy.
In this post, I’m going to list a number of things that I do not want to forget about my old buddy Android OS. Partly because I don’t want other people to make the same mistakes, but also because there are not a lot of logical threads to remember them by.
I will refer to how we solved the Android-specific main loop issues in Devil’s Attorney, which is available on Google Play.
Background
A main loop is where you process user input, update the state of the game, and where you issue your draw calls to OpenGL before you swap the front buffer with the back buffer. When porting a cross-platform game like Devil’s Attorney to Android, most of the platform specific code lies in how the main loop is set up and how OS events are handled.
On Android this is typically done by either using a GLSurfaceView, which has the following drawbacks:
- Rendering thread in Java, which forces you to use Java thread locking mechanisms
- No low level control of the EGL context, everything handled by this do-everything class
or by using NativeActivity, which has the following drawbacks:
- Just an NDK layer around a regular Android Activity
- Not a pure Java Activity, which makes Android SDK specific tasks such as playing video, playing streamed audio or starting other activities awkward
GLSurfaceView was a deal breaker for me, because it relies on the Dalvik virtual machine to schedule the rendering thread. Setting this aside, Android OS may destroy parts of the GLSurfaceView on the main thread, requiring you to do awkward thread locking with the rendering thread at certain points. This may be all right in Java, but in conjunction with native calls through JNI it can become complex really fast. This also holds true for handling touch events and other kinds of input to the game, and for a game running at 60 Hz you want to minimize usage of Java.
NativeActivity seems like a blessing at first, because it advertises that you can program everything natively. This might seem true, but it really is not. Very few things in the Android SDK have native counterparts in the Android NDK, so whenever you need to start another Activity, stream audio, play video or set up layouts, you have to go through JNI.
Using JNI to set up even simple things in C/C++ can become very messy. An example of this is the creation of an AudioTrack instance using JNI in C:

    /* Look up the class and its six-int constructor, then instantiate it. */
    jclass AudioTrackClass = (*env)->FindClass(env, "android/media/AudioTrack");
    jmethodID AudioTrackInit = (*env)->GetMethodID(env, AudioTrackClass, "<init>", "(IIIIII)V");
    jobject track = (*env)->NewObject(env, AudioTrackClass, AudioTrackInit,
                                      STREAM_MUSIC, sampleRateInHz, channelConfig,
                                      audioFormat, 1024, MODE_STREAM);

Compare this to:

    AudioTrack track = new AudioTrack(STREAM_MUSIC, sampleRateInHz, channelConfig, audioFormat, 1024, MODE_STREAM);

I doubt that the JNI way of looking up strings for methods and binding calls with Java is any more efficient than just doing it in Java. Memory references are still owned and garbage collected by Dalvik, the only difference is that you get an extra reference that you manually need to release.
The main loop in Devil’s Attorney
In Devil’s Attorney, among other things we do the following using the Android SDK in Java:
- Play the intro video
- Stream audio tracks
- Start a separate APK Expansion Files Downloader Activity
- Query the device for resolution, type of device and version of Android OS
- Handle lifecycle events
- Show the Splash Screen
- Handle user input
While we have the above requirements for the Java part of the application, we also need to have a main loop thread running the native code of the game. Instead of going with either NativeActivity or GLSurfaceView, we decided to roll our own solution which satisfies the following:
- No recurring calls to and from the Java part of the game and the native part during each frame
- User input events and lifecycle messages are queued and double buffered to the native part of the application and consumed each frame
- Ease of using parts of the Android SDK in conjunction with the native part of the game
The Native Part of Devil’s Attorney
The only vital thing that the game needs from Android OS in order to set up an EGL context in native code is a way to satisfy the parameters of the following function:

    eglCreateWindowSurface(display, config, _window, 0);

    where
    display <- can be deduced in native code
    config  <- can be deduced in native code
    window  <- platform specific reference, needed from Android OS

The window reference is platform specific, and can be acquired in Java by instantiating something called a SurfaceView. A SurfaceView contains a SurfaceHolder, which is an object that has a dimension as well as a pixel format. The internal data structure of a SurfaceHolder is called a Surface, and is a raw buffer that is used by the compositor before it is blitted to the framebuffer. Of all of these classes, the only one that you actually need to be concerned about is the Surface. This is the only object that you need in order to create an EGL context. If you pass this object via JNI to native code, you can deduce the window reference by calling an Android NDK library function in the following way:
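
    /* ANativeWindow_fromSurface is declared in <android/native_window_jni.h>. */
    JNIEXPORT void JNICALL
    Java_com_senri_da_DevilsAttorneyActivity_nativeSetSurface(JNIEnv* jenv, jobject obj, jobject surface)
    {
        /* Acquires a reference to the native window backing the Java Surface. */
        ANativeWindow* window = ANativeWindow_fromSurface(jenv, surface);
    }
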
With the Java part of the application decoupled, the native part holds the actual main loop of the game, which runs in a natively created thread using pthreads. The main loop does roughly this:

    while(running)
    {
        processMessagesAndUserInput();
        update();
        draw();
        swapBuffers();
    }

The processMessagesAndUserInput method locks a semaphore, copies data containing messages and user input, and acts as a barrier for the following events:
- EGL Context Creation
- OpenGL state saving and the destruction of the EGL Context
- Pausing of the game
- Resuming of the game
- Hardware button events
- Touch Events
update(), draw() and swapBuffers() do the bulk of the work and can be found in any run-of-the-mill main loop.
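The queueing behind processMessagesAndUserInput can be sketched in plain C. This is only a minimal sketch under assumptions, not the code from the game: the event types, the EventQueue struct and all function names are hypothetical, and a pthread mutex stands in for the semaphore mentioned above. The idea is that the Java side appends to one buffer while the main loop reads the other, so the lock is only held for the swap and never for a whole frame.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical event types; the names are illustrative only. */
typedef enum { EV_TOUCH, EV_PAUSE, EV_RESUME } EventType;
typedef struct { EventType type; float x, y; } Event;

#define MAX_EVENTS 64

typedef struct {
    Event buffers[2][MAX_EVENTS];
    size_t count[2];
    int write_index;              /* buffer the UI thread appends to */
    pthread_mutex_t lock;
} EventQueue;

void queue_init(EventQueue *q) {
    assert(q != NULL);
    q->count[0] = q->count[1] = 0;
    q->write_index = 0;
    pthread_mutex_init(&q->lock, NULL);
}

/* UI thread side (reached via JNI in a real app). Returns 0 if the buffer is full. */
int queue_push(EventQueue *q, Event ev) {
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    size_t n = q->count[q->write_index];
    if (n < MAX_EVENTS) {
        q->buffers[q->write_index][n] = ev;
        q->count[q->write_index] = n + 1;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Main loop side, once per frame: swap the buffers under the lock and
   return the events collected since the previous swap. */
const Event *queue_swap(EventQueue *q, size_t *out_count) {
    assert(out_count != NULL);
    pthread_mutex_lock(&q->lock);
    int read_index = q->write_index;
    q->write_index = 1 - read_index;
    q->count[q->write_index] = 0;  /* drop what was consumed last frame */
    *out_count = q->count[read_index];
    pthread_mutex_unlock(&q->lock);
    return q->buffers[read_index];
}

/* First step of the loop: consume this frame's events without holding the lock.
   Returns the number of events that were processed. */
int process_messages(EventQueue *q, int *paused) {
    size_t n;
    const Event *events = queue_swap(q, &n);
    for (size_t i = 0; i < n; i++) {
        if (events[i].type == EV_PAUSE) *paused = 1;
        else if (events[i].type == EV_RESUME) *paused = 0;
        /* touch and button events would be forwarded to the game here */
    }
    return (int)n;
}
```

In this sketch the main loop would call process_messages at the top of every frame and skip update() and draw() while paused is set. A full buffer simply drops the event, which is a simplification; lifecycle messages would need a stronger guarantee.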
The Java part of Devil’s Attorney
In the Android Activity on the Java side of things, we have the following:
- JNI methods to play videos and stream audio.
- Code that identifies the resolution of the screen and creates a SurfaceView, which is later passed to the native main loop via a message
- Handles callbacks from the SurfaceHolder, more specifically: surfaceChanged, surfaceCreated, and surfaceDestroyed.
- Callbacks for touch events and other kinds of user input
- An ability to start the APK Expansion Files Downloader Activity
- Handles all kinds of lifecycle events, minimizing the amount of messages needed to be passed to the native main loop
- Presents the Splash screen
Android lifecycle events and compatibility issues
Android lifecycle events can become very complicated when using an Activity in conjunction with a Surface. Our game supports 2563 (!) different kinds of devices, and heavy testing on multiple devices running different flavors of Android required us to be very careful when setting up the main loop. This led to the following assumptions and precautions:
- Surfaces can be destroyed at any time
- Activities can be destroyed at any time
- The EGL Context is never preserved in any way by Dalvik when the app is suspended (although some devices claim that it is).
- Mismatches between pixel format queries for the Surface and EGL Context parameters
- A complete restart of the application with the native parts still alive. This occurs when Dalvik starts killing everything, except for the native pthread. This is fairly common on Gingerbread devices with low RAM.
- Special considerations when constructing the SurfaceView. We use a deprecated layout called AbsoluteLayout, which proved to have the least problems across devices.
- Post OpenGL context creation issues on high resolution devices with weak GPUs (Adreno). These require a recreation of the Surface as well as the EGL Context.
- Multiple safety checks between API calls querying and setting the size of the frame buffer/SurfaceView, because some devices lie in different ways depending on OS version and manufacturer. This is especially prevalent on Android Gingerbread and Honeycomb devices that have softkeys.
- Saving configuration information if a device crashes during initialization, so that when the user restarts the game it goes into safe mode. This is not something that is common, but if it affects 1 user in 1000 we need to fix it. This is a nightmare that cannot be avoided when so many users of Android customize their ROMs with faulty graphics drivers and experimental features.
- A meticulous locking sequence of threads when surfaces are destroyed. Destruction of surfaces needs to be thread locked so that the caller of surfaceDestroyed is frozen until the entire EGL Context (textures, states, framegrab of the game) is preserved to RAM/Disk.
- Very conservative usage of extensions and OpenGL capabilities. Trusting OpenGL extension queries on old roughed up Android devices is not a safe bet.
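The locking sequence around surfaceDestroyed from the list above could look roughly like the following. This is a hedged sketch and not our actual implementation: SurfaceHandshake, the handshake_* functions and count_save are hypothetical names, and the save_state callback stands in for whatever preserves textures and state before the EGL Context is destroyed. The caller of surfaceDestroyed first posts the request and then blocks on a condition variable until the render thread reports that everything has been saved.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    int destroy_requested;  /* set by the thread that runs surfaceDestroyed */
    int state_saved;        /* set by the render thread once it is done */
} SurfaceHandshake;

void handshake_init(SurfaceHandshake *h) {
    pthread_mutex_init(&h->lock, NULL);
    pthread_cond_init(&h->changed, NULL);
    h->destroy_requested = 0;
    h->state_saved = 0;
}

/* UI thread, step 1: announce that the surface is going away. */
void handshake_request_destroy(SurfaceHandshake *h) {
    pthread_mutex_lock(&h->lock);
    h->destroy_requested = 1;
    h->state_saved = 0;
    pthread_mutex_unlock(&h->lock);
}

/* UI thread, step 2: block until the render thread has saved its state.
   Only after this returns should surfaceDestroyed be allowed to return. */
void handshake_wait_saved(SurfaceHandshake *h) {
    pthread_mutex_lock(&h->lock);
    while (!h->state_saved)
        pthread_cond_wait(&h->changed, &h->lock);
    pthread_mutex_unlock(&h->lock);
}

/* Render thread, once per frame: if a destroy was requested, save state and
   wake the waiting UI thread. Returns 1 if a request was handled. */
int handshake_service(SurfaceHandshake *h, void (*save_state)(void *), void *user) {
    assert(save_state != NULL);
    pthread_mutex_lock(&h->lock);
    int requested = h->destroy_requested;
    pthread_mutex_unlock(&h->lock);
    if (!requested)
        return 0;

    save_state(user);  /* e.g. preserve textures, then destroy the EGL Context */

    pthread_mutex_lock(&h->lock);
    h->destroy_requested = 0;
    h->state_saved = 1;
    pthread_cond_signal(&h->changed);
    pthread_mutex_unlock(&h->lock);
    return 1;
}

/* Stand-in for the real save routine: it only counts how often it ran. */
void count_save(void *user) {
    ++*(int *)user;
}
```

In a real app, handshake_request_destroy and handshake_wait_saved would be called back to back from the JNI function behind surfaceDestroyed, while handshake_service would run inside the message processing step of the main loop. A timeout on the wait is probably a good idea in practice, since a stalled render thread would otherwise freeze the UI thread.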
Conclusions
There might be a lot of better ways of setting up main loops on Android, but this solution proved to be the best for us. We don’t get a lot of compatibility support mail and the game seems to run fine on most devices. Supporting every single device on Android remains a challenge, but most of the issues lie in the main loop.
On a side note, I’m currently investigating why glReadPixels is not working on a device that is “running LiquidSmooth 2.8 which is an AOSP based ROM, although it has a few CyanogenMod tweaks and features”.