## Performance of `unwrapIris()`

Published 11 February 2014

After the performance improvement from yesterday, I wanted to try some more things, because the speed was still not satisfactory (I spent an hour processing 2000 images).

So I’ve whipped up line_profiler again:

This gave me the following trace:

Hmmm, 87% of the time is spent in `unwrapIris()`. Let’s take a look at what that looks like:

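```python
import cv2
import numpy as np


def unwrapIris(image, iris_center, iris_radius, nsamples=360, show=False):
    # nsamples - 1 angles in [0, 2*pi); the last column of polar stays zero
    samples = np.linspace(0, 2 * np.pi, nsamples)[:-1]
    polar = np.zeros((iris_radius, nsamples), dtype=np.uint8)
    for r in range(iris_radius):
        for theta in samples:
            # Polar -> Cartesian, centered on the iris
            x = r * np.cos(theta) + iris_center[0]
            y = r * np.sin(theta) + iris_center[1]
            # Array indices must be integers, so the float values are truncated.
            # Note: negative coordinates wrap around instead of raising IndexError.
            try:
                polar[r][int(theta * nsamples / 2.0 / np.pi)] = image[int(y)][int(x)]
            except IndexError:
                polar[r][int(theta * nsamples / 2.0 / np.pi)] = 0
    if show:
        cv2.imshow("Iris", polar)
    return polar
```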

And its profile:

So the load is spread across three lines: two trigonometric calculations and one array access with some index arithmetic. The main problem is that these three lines run for every pixel; for the 10 images processed in this sample, that is 475 200 hits. If I have learned anything about Python performance in the past year, it is to move loops into an external library whenever possible. In this case we’ll look at how this could be moved over to either NumPy or OpenCV. Since these libraries are written in C or C++, they can perform the same loop much faster than plain Python.
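To make the point about moving loops into a library concrete, here is a minimal sketch that times a cosine computed element by element against a single vectorized call. The function names `cos_loop` and `cos_vectorized` and the array size (47 520, i.e. the hit count above divided by 10 images) are made up for this illustration; actual timings will depend on the machine.

```python
import timeit

import numpy as np


def cos_loop(angles):
    # Pure-Python loop: one interpreter round trip per element.
    out = np.empty(len(angles))
    for i in range(len(angles)):
        out[i] = np.cos(angles[i])
    return out


def cos_vectorized(angles):
    # One call; the loop over elements runs inside NumPy's compiled code.
    return np.cos(angles)


# Hypothetical workload: roughly one image's worth of samples.
angles = np.linspace(0, 2 * np.pi, 47520)

loop_time = timeit.timeit(lambda: cos_loop(angles), number=3)
vec_time = timeit.timeit(lambda: cos_vectorized(angles), number=3)
print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
```

Both functions return the same values; only the place where the loop runs differs.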

Thanks to NumPy’s operator overloading, it is possible to write code that looks really weird at first glance if you’re used to loops from other languages. Look at lines 6 and 7 of the new version further down, where `x` and `y` are computed. `magnitude` is an array, `angle` is an array and `iris_center[0]` is a plain `int` as it is used here. NumPy calculates this correctly and the loops are now in the library, yay!
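As a small illustration of that array-times-array-plus-int arithmetic, here is a sketch with tiny hand-written arrays. The values and the name `center_x` are assumptions chosen for readability; they are not taken from the real images.

```python
import numpy as np

# One radius per row, one angle per column (tiny illustrative values).
magnitude = np.array([[0, 0, 0],
                      [1, 1, 1]])
angle = np.array([[0.0, np.pi / 2, np.pi],
                  [0.0, np.pi / 2, np.pi]])
center_x = 10  # a plain int, standing in for iris_center[0]

# np.cos runs element-wise, the product is element-wise,
# and the int is broadcast to every element of the result.
x = magnitude * np.cos(angle) + center_x
print(x.shape)
print(np.round(x, 3))
```

The first row (radius 0) collapses to the center, and the second row moves one pixel right, stays put, and moves one pixel left as the angle goes from 0 to π.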

So what changed? Line 4 (the `np.meshgrid()` call) makes two arrays, one with angles and one with magnitudes, both of shape `iris_radius × (nsamples - 1)`, which is about `130x359` for most of my images. The angle array is basically just a single row of radian values from 0 up to just under 2π, repeated 130 times, and magnitude is a column from 0 to 129 repeated horizontally 359 times.
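The layout of the two arrays is easier to see at a small scale. This sketch uses the same `np.linspace` and `np.meshgrid` calls, but with made-up sizes (`nsamples = 5`, `iris_radius = 3`) so the result can be printed in full.

```python
import numpy as np

nsamples, iris_radius = 5, 3  # tiny illustrative values, not the real ones

# Dropping the last sample leaves nsamples - 1 angles: 0, pi/2, pi, 3*pi/2.
samples = np.linspace(0, 2 * np.pi, nsamples)[:-1]

# With the default indexing, the first argument varies along columns
# and the second along rows.
angle, magnitude = np.meshgrid(samples, np.arange(iris_radius))

print(angle.shape, magnitude.shape)
print(magnitude)
```

Every row of `angle` is a copy of `samples`, and every column of `magnitude` counts from 0 to `iris_radius - 1`.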

Lines 6 and 7 convert these two arrays to X,Y coordinates in the image, from which we will be sampling later. `convertMaps()` converts them further to improve the performance of `cv2.remap()`. And finally `cv2.remap()` uses these coordinate maps to sample the original image into the polar image.
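To show what sampling through coordinate maps means, here is a NumPy-only stand-in. It is not how `cv2.remap()` is implemented: it assumes a grayscale image, rounds to the nearest pixel instead of interpolating, and fills out-of-range lookups with a constant. The helper name `sample_nearest` is hypothetical.

```python
import numpy as np


def sample_nearest(image, x, y, fill=0):
    """Nearest-neighbour lookup of a grayscale image at float maps x, y."""
    xi = np.rint(x).astype(int)
    yi = np.rint(y).astype(int)
    h, w = image.shape
    # Mask of coordinates that fall inside the image.
    inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.full(x.shape, fill, dtype=image.dtype)
    # Fancy indexing pulls all valid pixels in one vectorized step.
    out[inside] = image[yi[inside], xi[inside]]
    return out


# A 5x5 test image whose pixel value is row * 5 + column.
image = np.arange(25, dtype=np.uint8).reshape(5, 5)
x = np.array([[0.0, 4.0], [2.2, 9.0]])  # column coordinates
y = np.array([[0.0, 0.0], [2.9, 1.0]])  # row coordinates

result = sample_nearest(image, x, y)
print(result)
```

The output has the shape of the maps, not of the image: each output pixel is looked up at the position the maps point to, and the last coordinate pair, which lies outside the image, gets the fill value.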

```python
def unwrapIris(image, iris_center, iris_radius, nsamples=360, show=False):
    samples = np.linspace(0, 2 * np.pi, nsamples)[:-1]

    angle, magnitude = np.meshgrid(samples, np.arange(iris_radius))

    x = magnitude * np.cos(angle) + iris_center[0]
    y = magnitude * np.sin(angle) + iris_center[1]

    x, y = cv2.convertMaps(x.astype('float32'), y.astype('float32'), cv2.CV_32FC1)
    return cv2.remap(image, x, y, cv2.INTER_LINEAR)
```

And now let’s take a look at what that did to the performance:

Nice! Now `unwrapIris()` is taking just 2.3% of the total time! And we have reduced the time needed to process 10 images from ~34s to about 4s, which is close to an order of magnitude improvement! Now `findIris()` is the slowest, maybe next time we’ll look at that.