I'd like to introduce a new open source framework that I've written, called GPUImage. The GPUImage framework is a BSD-licensed iOS library (for which the source code can be found on GitHub) that lets you apply GPU-accelerated filters and other effects to images, live camera video, and movies. In comparison to Core Image (part of iOS 5.0), GPUImage allows you to write your own custom filters, supports deployment to iOS 4.0, and has a slightly simpler interface. However, it currently lacks some of the more advanced features of Core Image, such as facial detection.

UPDATE (4/15/2012): I've disabled comments, because they were getting out of hand. If you wish to report an issue with the project, or request a feature addition, go to its GitHub page. If you want to ask a question about it, contact me at the email address in the footer of this page, or post in the new forum I have set up for the project.

About a year and a half ago, I gave a talk at SecondConf where I demonstrated the use of OpenGL ES 2.0 shaders to process live video. The subsequent writeup and sample code that came out of that proved to be fairly popular, and I've heard from a number of people who have incorporated that video processing code into their iOS applications. However, the amount of code surrounding the OpenGL ES 2.0 portions of that example made it difficult to customize and reuse. Since much of this code was just scaffolding for interacting with OpenGL ES, it could stand to be encapsulated in an easier-to-use interface.

[Image: example of four types of video filters]

Since then, Apple has ported some of their Core Image framework from the Mac to iOS. Core Image provides an interface for doing filtering of images and video on the GPU. Unfortunately, the current implementation on iOS has some limitations. The largest of these is the fact that you can't write your own custom filters based on their kernel language, like you can on the Mac. This severely restricts what you can do with the framework. Other downsides include a somewhat more complex interface and a lack of iOS 4.0 support. Others have complained about some performance overhead, but I've not benchmarked this myself.

Because of the lack of custom filters in Core Image, I decided to convert my video filtering example into a simple Objective-C image and video processing framework. The key feature of this framework is its support for completely customizable filters that you write using the OpenGL Shading Language. It also has a straightforward interface (which you can see some examples of below) and support for iOS 4.0 as a target.

Note that this framework is built around OpenGL ES 2.0, so it will only work on devices that support this API. This means that this framework will not work on the original iPhone, the iPhone 3G, or the first- and second-generation iPod touch. All other iOS devices are supported.

The following is my first pass of documentation for this framework, an up-to-date version of which can be found within the framework repository on GitHub:

General architecture

GPUImage uses OpenGL ES 2.0 shaders to perform image and video manipulation much faster than could be done in CPU-bound routines. It hides the complexity of interacting with the OpenGL ES API in a simplified Objective-C interface. This interface lets you define input sources for images and video, attach filters in a chain, and send the resulting processed image or video to the screen, to a UIImage, or to a movie on disk.

Images or frames of video are uploaded from source objects, which are subclasses of GPUImageOutput. These include GPUImageVideoCamera (for live video from an iOS camera) and GPUImagePicture (for still images). Source objects upload image frames to OpenGL ES as textures, then hand those textures off to the next objects in the processing chain.

Filters and other subsequent elements in the chain conform to the GPUImageInput protocol, which lets them take in the supplied or processed texture from the previous link in the chain and do something with it. Objects one step further down the chain are considered targets, and processing can be branched by adding multiple targets to a single output or filter.

For example, an application that takes in live video from the camera, converts that video to a sepia tone, then displays the video onscreen would set up a chain looking something like the following:

GPUImageVideoCamera -> GPUImageSepiaFilter -> GPUImageView
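
In code, setting up that particular chain might look something like the following sketch. This assumes it runs inside a view controller (hence self.view), and the camera, filter, and view would normally be kept in properties or instance variables so they aren't deallocated:

GPUImageVideoCamera *videoCamera = [[GPUImageVideoCamera alloc] initWithSessionPreset:AVCaptureSessionPreset640x480 cameraPosition:AVCaptureDevicePositionBack];
GPUImageSepiaFilter *sepiaFilter = [[GPUImageSepiaFilter alloc] init];
GPUImageView *filteredView = [[GPUImageView alloc] initWithFrame:self.view.bounds];
[self.view addSubview:filteredView];

[videoCamera addTarget:sepiaFilter];
[sepiaFilter addTarget:filteredView];
// Adding further targets to videoCamera or sepiaFilter here would branch the processing chain.

[videoCamera startCameraCapture];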

A small number of filters are built in:

  • GPUImageBrightnessFilter
  • GPUImageContrastFilter
  • GPUImageSaturationFilter
  • GPUImageGammaFilter
  • GPUImageColorMatrixFilter
  • GPUImageColorInvertFilter
  • GPUImageSepiaFilter: Simple sepia tone filter
  • GPUImageDissolveBlendFilter
  • GPUImageMultiplyBlendFilter
  • GPUImageOverlayBlendFilter
  • GPUImageDarkenBlendFilter
  • GPUImageLightenBlendFilter
  • GPUImageRotationFilter: This lets you rotate an image left or right by 90 degrees, or flip it horizontally or vertically
  • GPUImagePixellateFilter: Applies a pixellation effect to an image or video, with the fractionalWidthOfAPixel property controlling how large the pixels are, as a fraction of the width and height of the image (see the configuration sketch after this list)
  • GPUImageSobelEdgeDetectionFilter: Performs edge detection, based on a Sobel 3x3 convolution
  • GPUImageSketchFilter: Converts video to a sketch, and is the inverse of the edge detection filter
  • GPUImageToonFilter
  • GPUImageSwirlFilter
  • GPUImageVignetteFilter
  • GPUImageKuwaharaFilter: Converts the video to an oil painting, but is very slow right now

but you can easily write your own custom filters using the C-like OpenGL Shading Language, as described below.
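
As a quick illustration of configuring one of the built-in filters, the following sketch creates a pixellation filter, adjusts its block size, and wires it into a chain using the videoCamera from the sketch above. The 0.05 value is just an arbitrary example:

GPUImagePixellateFilter *pixellateFilter = [[GPUImagePixellateFilter alloc] init];
pixellateFilter.fractionalWidthOfAPixel = 0.05; // Arbitrary example: each pixellated block spans 5% of the image's width and height
[videoCamera addTarget:pixellateFilter]; // Inserted into a chain with addTarget:, just like any other filter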

Adding the framework to your iOS project

Once you have the latest source code for the framework, it's fairly straightforward to add it to your application. Start by dragging the GPUImage.xcodeproj file into your application's Xcode project to embed the framework in your project. Next, go to your application's target and add GPUImage as a Target Dependency. Finally, you'll want to drag the libGPUImage.a library from the GPUImage framework's Products folder to the Link Binary With Libraries build phase in your application's target.

GPUImage needs a few other frameworks to be linked into your application, so you'll need to add the following as linked libraries in your application target:

  • CoreMedia
  • CoreVideo
  • OpenGLES
  • AVFoundation
  • QuartzCore

You'll also need to tell the compiler where to find the framework headers: in your project's build settings, set the Header Search Paths to the relative path from your application to the framework/ subdirectory within the GPUImage source directory, and make this header search path recursive.
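
For example, if the GPUImage source directory sits next to your application's .xcodeproj file, a Header Search Path of GPUImage/framework would work; adjust the path to match wherever you placed the GPUImage checkout.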

To use the GPUImage classes within your application, simply include the core framework header using the following:

#import "GPUImage.h"

As a note: if you run into the error "Unknown class GPUImageView in Interface Builder" or the like when trying to build an interface with Interface Builder, you may need to add -ObjC to your Other Linker Flags in your project's build settings.

Performing common tasks

Filtering live video

To filter live video from an iOS device's camera, you can use code like the following:

GPUImageVideoCamera *videoCamera = [[GPUImageVideoCamera alloc] initWithSessionPreset:AVCaptureSessionPreset640x480 cameraPosition:AVCaptureDevicePositionBack];
GPUImageFilter *customFilter = [[GPUImageFilter alloc] initWithFragmentShaderFromFile:@"CustomShader"];
GPUImageView *filteredVideoView = [[GPUImageView alloc] initWithFrame:CGRectMake(0.0, 0.0, viewWidth, viewHeight)];
 
// Add the view somewhere so it's visible
 
[videoCamera addTarget:customFilter];
[customFilter addTarget:filteredVideoView];
 
[videoCamera startCameraCapture];

This sets up a video source coming from the iOS device's back-facing camera, using a preset that tries to capture at 640x480. A custom filter, using code from the file CustomShader.fsh, is then set as the target for the video frames from the camera. These filtered video frames are finally displayed onscreen with the help of a UIView subclass that can present the filtered OpenGL ES texture that results from this pipeline.

Processing a still image

There are a couple of ways to process a still image and create a result. The first way you can do this is by creating a still image source object and manually creating a filter chain:

UIImage *inputImage = [UIImage imageNamed:@"Lambeau.jpg"];
 
GPUImagePicture *stillImageSource = [[GPUImagePicture alloc] initWithImage:inputImage];
GPUImageSepiaFilter *stillImageFilter = [[GPUImageSepiaFilter alloc] init];
 
[stillImageSource addTarget:stillImageFilter];
[stillImageSource processImage];
 
UIImage *currentFilteredVideoFrame = [stillImageFilter imageFromCurrentlyProcessedOutput];

For single filters that you wish to apply to an image, you can simply do the following:

GPUImageSepiaFilter *stillImageFilter2 = [[GPUImageSepiaFilter alloc] init];
UIImage *quickFilteredImage = [stillImageFilter2 imageByFilteringImage:inputImage];
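
If you then want to save a filtered result to disk, the returned UIImage can be handled with the standard UIKit and Foundation APIs. A minimal sketch (with an arbitrary output filename and compression quality) might look like this:

// Write the filtered image out as a JPEG in the app's Documents directory (arbitrary filename).
NSData *jpegData = UIImageJPEGRepresentation(quickFilteredImage, 0.8);
NSString *outputPath = [NSHomeDirectory() stringByAppendingPathComponent:@"Documents/FilteredImage.jpg"];
[jpegData writeToFile:outputPath atomically:YES];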

Writing a custom filter

One significant advantage of this framework over Core Image on iOS (as of iOS 5.0) is the ability to write your own custom image and video processing filters. These filters are supplied as OpenGL ES 2.0 fragment shaders, written in the C-like OpenGL Shading Language.

A custom filter is initialized with code like

GPUImageFilter *customFilter = [[GPUImageFilter alloc] initWithFragmentShaderFromFile:@"CustomShader"];

where the extension used for the fragment shader is .fsh. Additionally, you can use the -initWithFragmentShaderFromString: initializer to provide the fragment shader as a string, if you would rather not ship your fragment shaders in your application bundle.
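
For example, a trivial passthrough fragment shader could be supplied inline as a string like this (just a sketch; in practice you'd substitute your own custom shader code):

NSString *passthroughShaderString =
    @"varying highp vec2 textureCoordinate;"
    @"uniform sampler2D inputImageTexture;"
    @"void main()"
    @"{"
    @"    gl_FragColor = texture2D(inputImageTexture, textureCoordinate);"
    @"}";
GPUImageFilter *inlineFilter = [[GPUImageFilter alloc] initWithFragmentShaderFromString:passthroughShaderString];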

Fragment shaders perform their calculations for each pixel to be rendered at that filter stage. They do this using the OpenGL Shading Language (GLSL), a C-like language with additions specific to 2-D and 3-D graphics. An example of a fragment shader is the following sepia-tone filter:

varying highp vec2 textureCoordinate;
 
uniform sampler2D inputImageTexture;
 
void main()
{
    lowp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
    lowp vec4 outputColor;
    outputColor.r = (textureColor.r * 0.393) + (textureColor.g * 0.769) + (textureColor.b * 0.189);
    outputColor.g = (textureColor.r * 0.349) + (textureColor.g * 0.686) + (textureColor.b * 0.168);
    outputColor.b = (textureColor.r * 0.272) + (textureColor.g * 0.534) + (textureColor.b * 0.131);
    outputColor.a = textureColor.a; // Preserve the input alpha so the output color is fully defined

    gl_FragColor = outputColor;
}

For an image filter to be usable within the GPUImage framework, the first two declarations are required: the textureCoordinate varying (the current coordinate within the texture, normalized to the range 0.0 to 1.0) and the inputImageTexture uniform (the actual input image frame texture).

The remainder of the shader grabs the color of the pixel at this location in the passed-in texture, manipulates it in such a way as to produce a sepia tone, and writes that pixel color out to be used in the next stage of the processing pipeline.

One thing to note when adding fragment shaders to your Xcode project is that Xcode thinks they are source code files. To work around this, you'll need to manually move your shader from the Compile Sources build phase to the Copy Bundle Resources one in order to get the shader to be included in your application bundle.
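
If you want to confirm that a shader actually made it into the bundle, a quick runtime check (using the CustomShader.fsh file from the earlier example) could look like this:

// pathForResource: returns nil if CustomShader.fsh was not copied into the application bundle.
NSString *shaderPath = [[NSBundle mainBundle] pathForResource:@"CustomShader" ofType:@"fsh"];
NSAssert(shaderPath != nil, @"CustomShader.fsh is missing from the application bundle");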

Filtering and re-encoding a movie

Movies can be loaded into the framework via the GPUImageMovie class, filtered, and then written out using a GPUImageMovieWriter. GPUImageMovieWriter is also fast enough to record video in realtime from an iPhone 4's camera at 640x480, so a direct filtered video source can be fed into it.

The following is an example of how you would load a sample movie, pass it through pixellation and rotation filters, then record the result to disk as a 480x640 H.264 movie:

movieFile = [[GPUImageMovie alloc] initWithURL:sampleURL];
pixellateFilter = [[GPUImagePixellateFilter alloc] init];
GPUImageRotationFilter *rotationFilter = [[GPUImageRotationFilter alloc] initWithRotation:kGPUImageRotateRight];
 
[movieFile addTarget:rotationFilter];
[rotationFilter addTarget:pixellateFilter];
 
NSString *pathToMovie = [NSHomeDirectory() stringByAppendingPathComponent:@"Documents/Movie.m4v"];
unlink([pathToMovie UTF8String]); // Delete any existing movie at this path, because the writer won't overwrite an existing file
NSURL *movieURL = [NSURL fileURLWithPath:pathToMovie];
 
movieWriter = [[GPUImageMovieWriter alloc] initWithMovieURL:movieURL size:CGSizeMake(480.0, 640.0)];
[pixellateFilter addTarget:movieWriter];
 
[movieWriter startRecording];
[movieFile startProcessing];

Once recording is finished, you need to remove the movie recorder from the filter chain and close off the recording using code like the following:

[pixellateFilter removeTarget:movieWriter];
[movieWriter finishRecording];

A movie won't be usable until recording has been finished off in this way, so if the app is interrupted before this point, the recording will be lost.

Sample applications

Several sample applications are bundled with the framework source. Most are compatible with both iPhone and iPad-class devices. They attempt to show off various aspects of the framework and should be used as the best examples of the API while the framework is under development. These include:

ColorObjectTracking

A version of my ColorTracking example ported across to use GPUImage, this application uses color in a scene to track objects from a live camera feed. The four views you can switch between include the raw camera feed, the camera feed with pixels matching the color threshold in white, the processed video where positions are encoded as colors within the pixels passing the threshold test, and finally the live video feed with a dot that tracks the selected color. Tapping the screen changes the color to track to match the color of the pixels under your finger. Tapping and dragging on the screen makes the color threshold more or less forgiving. This is most obvious on the second, color thresholding view.

SimpleImageFilter

A bundled JPEG image is loaded into the application at launch, a filter is applied to it, and the result rendered to the screen. Additionally, this sample shows two ways of taking in an image, filtering it, and saving it to disk.

MultiViewFilterExample

From a single camera feed, four views are populated with realtime filters applied to the camera video. One is just the straight camera feed, one is a preprogrammed sepia tone, and two are custom filters based on shader programs.

FilterShowcase

This demonstrates every filter supplied with GPUImage.

BenchmarkSuite

This is used to test the performance of the overall framework by testing it against CPU-bound routines and Core Image. Benchmarks involving still images and video are run against all three, with results displayed in-application.

Things that need work

This is just a first release, and I'll keep working on this to add more functionality. I also welcome any and all help with enhancing this. Right off the bat, these are missing elements I can think of:

  • Images that exceed 2048 pixels wide or high currently can't be processed on devices older than the iPad 2 or iPhone 4S.
  • Currently, it's difficult to create a custom filter with additional attribute inputs and a modified vertex shader.
  • Many common filters aren't built into the framework yet.
  • Video capture and processing should be done on a background GCD serial queue.
  • I'm sure that there are many optimizations that can be made on the rendering pipeline.
  • The aspect ratio of the input video is not maintained, but stretched to fill the final image.
  • Errors in shader setup and other failures need to be explained better, and the framework needs to be more robust when encountering odd situations.

Hopefully, people will find this to be helpful in doing fast image and video processing within their iOS applications.

      