Using NSOperation to speed up compute-intensive tasks.

The NSOperation class, an abstract class available in Mac OS X 10.5 and newer, is fantastically simple to use and makes it easy to gain performance on Macs with multiple processors.

The iOS version of Lens•Lab did not have any multi-processor optimizations in it: the only iOS device with multiple processors is the iPad 2, and it wasn’t even released when I began work on Lens•Lab.

The biggest performance problem I faced with Lens•Lab was the simulated blur. At the beginning I always feared that any performance problems I was going to see would be related to drawing the background artwork. As it turns out, that job is trivial, performance-wise.

One of the cool things (I think) about Lens•Lab is the blur we do on the near and far out of focus areas. It really drives home the depth of field idea and gives one a more intuitive grasp of how optics works. But I immediately found out that doing a blur in realtime on mobile devices would be a challenge.

(At this point some programmers smarter than I might ask why I didn’t just use OpenGL for the blur. The answer is that I don’t know OpenGL, and learning it would have pushed back the release of Lens•Lab. I really wanted to get it out in the world, plus it was fun getting the blur algorithm tuned and trying to squeeze every drop of sweet, delicious performance that I could out of it.)

After doing some performance analysis and fine-tuning of the blur algorithm, I was fairly satisfied with the results, and Lens•Lab was released for iOS devices.

One thing that didn’t make the iOS cut is the fancy blur where the amount of blur increases gradually the further you get from the exact near and far depth of field distances. This gradually increasing blur is super-expensive computationally. It could be that I have a crappy algorithm, but I’ve made it run as fast as I can.
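To make the idea concrete, here is a rough sketch of the distance-to-radius mapping (the names and the simple linear ramp are hypothetical, not my actual code, but the shape of the problem is the same):

// Hypothetical sketch of a graduated blur's radius ramp; illustrative only.
// Pixels inside the depth of field get radius 0; past the DOF limit the
// radius grows linearly until it hits maxRadius.
static int radiusForDistance(float distance, float dofLimit,
                             float maxDistance, int maxRadius) {
    if (distance <= dofLimit) return 0;               // in focus: stay sharp
    float t = (distance - dofLimit) / (maxDistance - dofLimit);
    if (t > 1.0f) t = 1.0f;
    return (int)(t * maxRadius + 0.5f);               // ramp up to maxRadius
}

The expense comes from that variation: instead of one blur pass at a fixed radius, you effectively need many passes at different radii. So what to do? Multi-cores to the rescue!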

I took the blur algorithm code out of its old home (my subclass of NSView) and moved it into its own NSOperation subclass. The header looks like this:

#import <Cocoa/Cocoa.h>

@interface BlurOperation : NSOperation {
    CGImageRef inImage;     // source image to blur
    int pixelRadius;        // blur radius in pixels
    float scale;
    int iterations;
    BOOL near;              // near or far out-of-focus area?
    CGImageRef outImage;    // the blurred result
}

@property CGImageRef outImage;

- (id)initWithImageRef:(CGImageRef)image pixelRadius:(int)radius scale:(float)imageScale iterations:(int)iterationAmount isNear:(BOOL)near;

- (void)main;

@end

So we have some ivars (including the inImage and outImage CGImageRefs), an init method that gathers the info we need, and a main.

The implementation is as simple as you would expect. The init method takes the parameters and sets up the ivars. main looks like this:

- (void)main {
    // Each operation may run on its own thread, so it needs its own pool.
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    // ... blur stuff ...

    [pool release];
}
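The init method isn’t shown above, but a minimal sketch of it would look something like this (I’ve renamed the last parameter to avoid shadowing the near ivar, and I retain the incoming CGImageRef so it’s guaranteed to survive until main runs on the queue):

- (id)initWithImageRef:(CGImageRef)image pixelRadius:(int)radius scale:(float)imageScale iterations:(int)iterationAmount isNear:(BOOL)isNearBlur {
    self = [super init];
    if (self) {
        inImage = CGImageRetain(image);   // keep the source image alive
        pixelRadius = radius;
        scale = imageScale;
        iterations = iterationAmount;
        near = isNearBlur;
    }
    return self;
}

- (void)dealloc {
    CGImageRelease(inImage);    // CGImageRelease is NULL-safe
    CGImageRelease(outImage);
    [super dealloc];
}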

And that’s it. In our NSView subclass, all we need to do is set up an NSOperationQueue when we initialize:

operationQueue = [[NSOperationQueue alloc] init];
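(By default, an NSOperationQueue decides how many operations to run concurrently based on system conditions like the number of available cores, so there’s nothing more to configure to get both blurs running side by side on a multi-core Mac.)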

And then when we want to perform the blur, we just have to do this:

BlurOperation *blurNear = [[BlurOperation alloc] initWithImageRef:nearCropRef pixelRadius:15 scale:1.0 iterations:kBlurIterations isNear:YES];

[operationQueue addOperation:blurNear];

BlurOperation *blurFar = [[BlurOperation alloc] initWithImageRef:farCropRef pixelRadius:20 scale:1.0 iterations:kBlurIterations isNear:NO];

[operationQueue addOperation:blurFar];

[operationQueue waitUntilAllOperationsAreFinished];

Of course we release the BlurOperation instances when we’re all done.
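In other words, once waitUntilAllOperationsAreFinished returns, we pull the results out of each operation’s outImage property before letting go of the operations themselves. A sketch (blurredNearRef and blurredFarRef are whatever your drawing code hangs on to):

CGImageRef blurredNearRef = CGImageRetain(blurNear.outImage);   // keep the results
CGImageRef blurredFarRef  = CGImageRetain(blurFar.outImage);

[blurNear release];   // done with the operations themselves
[blurFar release];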

What I found when I did this was a 50% speed-up in frames per second when manipulating the controls or resizing the window. That makes sense: the near blur area is roughly half the size of the far blur area, with half as many pixels to process, so when the two operations are added to the queue the near blur is almost always going to finish first and the total time is dominated by the far blur alone. Going from near-plus-far in sequence to just the far blur works out to roughly a 1.5× gain. Either way, taking the compute-intensive task of doing this graduated blur in real time and turning it into an NSOperation subclass was really easy, and it made a huge difference in how fast the Mac version of Lens•Lab runs. Hopefully you can be inspired to try this with your app as well!

The cool thing about this is that I will be able to add this to the iOS version of Lens•Lab and have awesome performance with devices that use the new Apple A5 dual-core processor!
