Sunday, December 14, 2014

Scala Bility

In the Coursera 'Functional Programming Principles in Scala' course there's an example of representing a set as a binary tree. In the lecture, Dr Ordersky defines a union method of a set returning a new set containing both the object and the argument. It's defined for the NonEmpty set as:

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
...
def union(other: IntSet): IntSet =
    ((left union right) union other) incl elem
...
}

There's another method, incl, that returns a set including that element. There's an exercise following the class where the task is to create a TweetSet class with similar functionality.

Unfortunately, defining union as described above causes the program to hang at the first access of the test dataset. It turns out the above is so slow creating a union of 7 sets, each with 100 elements, that it never finishes. One solution is to replace the union function with the following (just rearranging brackets):

class NonEmpty(elem: Tweet, left: TweetSet, right: TweetSet) extends TweetSet {
...
def union(other: TweetSet): TweetSet =
    left union (right union (other incl elem))
...
}

After which it evaluates almost immediately. So why is there such a difference between the two? We can compare performance by counting the number of times the union function gets called for varying set sizes, where Sn is a set with n elements, and u is the union function:


MethodS10uS10S10uS1S20uS1S50uS1
((LuR)uO)inclE90902,1792,131,662
Lu(Ru(OinclE))10102050

We can see here that the only thing that affects the amount of times union gets called is the size of the left array S10uS10 has the same calls as S10uS1, and so I used S1 on the right hand side for the other tests. We can also see how much less efficient the first method is: It takes over 2 million calls for a set of 50 elements, versus only 50 for the second method.

For the following, we consider AuB, where A has length N, and the left tree of A has length M (0 <= M <= N-1), leaving the right tree with N-M-1. We define T(N) as the time it takes to union two trees where the left hand side has length N, therefore AuB will take T(N), and that T(0)=1 and the cost of calling union once is 1.

In the first method, the first step (LuR) will take T(M) and return a tree of length N-1. Then (LuR)uO will take T(N-1). This gives us T(N)=T(M)+T(N-1)+1. In the best case scenario, M=0 and T(M)=T(0)=1. We then get O(N^2) execution time. In the worst case scenario, M=(N-1) and we have T(N) = 2*T(N-1), giving O(2^N) execution time. From the data above, I empirically get something like 1.4^N time.

For the second method, Ru(OinclE), will take T(N-M-1) and Lu(Ru(OinclE)) will take T(M) giving T(N) = T(N-M-1)+T(M)+1 with T(0)=1. This works out at O(N) execution time, as we see above.

There's a lot of hype surrounding functional programming at the moment. I was quite impressed with the ability to define data structures from really basic, almost trivially correct functions. Unfortunately, it's not possible to blindly implement things without considering the performance overhead, especially in recursive functions. Whether it's possible for smart compilers to optimize away problems like this or not, for the moment writing efficient functional programs relies on very careful analysis, and hacks like tail-recursion. As seen above, two functionally identical procedures can have wildly different execution times with just a simple rearrangement of parentheses, and predicting which one is more efficient isn't (to me at least) intuitive.

Monday, August 11, 2014

Writing PNG files in Python

Writing PNG files in Python

A while ago I was trying to work out how to show a chart of some values on a web based UI. I'd originally though it might be fun to try and write PNGs from scratch, especially as the filtering included lends itself well to blocks of solid color, exactly what you'd expect from a simple 2-color bar chart. I got discouraged when looking at the PNG spec, and all of the libraries wanted to deal with image data (I really just wanted bars of a certain length, and didn't necessarily want to create an image out of them), so I gave up and used SVG instead.

But I started looking into it again recently to see if it really was that complicated to write PNGs, and after working out that the built-in zlib module has the compression and CRC functionality already there, it turned out to be quite simple.

To write a PNG block, you specify the length of the data, a 4-byte header, the data itself, and a CRC of everything except the length:

def yield_block(header, data):
    assert len(header)==4, 'header must be 4 bytes!'
    # length:
    yield struct.pack('! L', len(data))
    # chunk type, 4 byte header
    yield header
    # data
    yield data
    # crc
    yield struct.pack(
        '! L',
        zlib.crc32("".join([header, data])) & 0xffffffff)

I decided to yield everything instead of writing to a file, because of something.

Then it's just a case of writing a magic header, a 'IHDR' block with information about the image, and 'IDAT' block with the data, and finally an empty 'IEND' block.

I decided to write a function that takes a list of numbers and outputs a simple 1-bit png. Each row has a number of white pixels as specified in the list, followed by enough black pixels to finish the line. I decided not to filter the scan lines in the end (I think it would make more sense to do so if I were to use bytes instead of bits for the pixel (and I didn't check to see if bits really saved space over bytes either)):

def make_bar_png(data, dmax='auto'):
    def make_line(line):
        r = [0xFF] * (line // 8)
        remaining_zeros = 8 - line % 8
        if remaining_zeros != 8:
            r.append(0xff ^ ((1 << remaining_zeros) - 1))
        r += [0x00] * (bytes_per_line - len(r))
        return r
        
    if dmax=='auto':
        dmax = max(data)
    bytes_per_line = (dmax+7) // 8
    width = dmax  
    height = len(data)
    bit_depth = 1
    color_type = 0  # grayscale
    compression_method, filter_method, interlace_method = 0, 0, 0
    yield bytearray([0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'])
    # our header block
    for b in yield_block('IHDR', 
                         struct.pack('! LLBBBBB',
                                     width, height, bit_depth,
                                     color_type, compression_method, 
                                     filter_method, interlace_method)):
        yield b
    #unfiltered data (start with 0 as filterbyte)
    dat = [str(bytearray([0]+make_line(x))) for x in data]
    for b in yield_block('IDAT',
                         zlib.compress("".join(dat))):
        yield b
        
    for b in yield_block('IEND', ''):
        yield b

And that's it, a simple horizontal PNG bar graph generator, just using the Python standard library.

Here's the first 12 digits of pi, as a (very small) bar graph, enjoy.

The code, and a couple of simple tests, are on github. It could probably be extended to make vertical bar graphs, and use the difference-of-preceding byte (or byte above) filtering to make the files smaller, but it works as it is.

Sunday, June 8, 2014

Node.js mandelbrot server

Node.js mandelbrot server


After trying in python and go, I thought I'd give node.js a try, to see how it performs against those two.

Once I found a png library for node.js, it was pretty easy to get working (definitely less to learn than the go version). It also performed pretty decently, being only about twice as slow as the go version.

The code's available here and I hope is pretty self explanatory. If you want to run it, use npm to install node-png then run the file. It should work with the html page from the python post.

Performance

After finding that everything seems faster than Python, I spent a bit of time optimizing the Python version, it no longer uses complex numbers but stores the real and imaginary parts in two separate images. It's still pretty readable. After doing that, the times were:

LanguageTime (s)
Python13.4
Go2.6
Node.js4.5

It's worth noting that these are the times for serial mandelbrot generation, both Python and go use multiple threads, but the node.js version doesn't (although still feels faster than the threaded Python version when run from the browser). Javascript isn't my language of choice (although it now seems fashionable to bash javascript-bashers) but it is nice to have a language that's literally on every computer, and that has had many man hours spent on optimizing the compiler to produce impressively fast code. 

Saturday, March 29, 2014

Frequent Faces

Frequent Faces

There was an interesting article a while ago (it looks like 2006 to be exact) where an image of Einstein was combined with an image of Marilyn Monroe. The resultant hybrid image contained high-frequency information from the Einstein image, and low-frequency information from the Marilyn image:

When you look at the composite image above, it looks like Einstein up close (where you can see the high frequency information, and it dominates) or Marilyn from a distance (or, if you're lucky enough to be short-sighted, without glasses) where the high-frequency information cannot be resolved and only the low frequency info gets through.

Creating Hybrid Images

The actual process for creating images like this is relatively simple. You take a Fourier transform of both images, remove (filter) the high frequency parts from one image (which are just the pixels away from the center) and the low frequency (pixels near the center) from the other, then add the images together. So all you need is a math library with a Fourier transform function. Like NumPy and Python.

The hardest part is actually finding two images that are similar enough to each other that they don't clash too much when combined (who knew Einstein and Marilyn looked so similar?).

I decided to try and combine Patrick Stewart and John Lennon, I found some photos online, lined them up and made them the same size using a paint program.


Next we need a way to filter the Fourier transforms. There's a lot of different ways to do this (eg Butterworth) but I decided to keep it simple and use a Gaussian filter. The code in python looks like:

def gauss_mask(shape, low):
    """
    Return a mask suitable for use on a reduced real FT
    IE zero frequency at 0,0 next at 0,1 1,0 and h-1,0
    """
    h, w = shape
    outw=w/2+1 # if reduced
    # we have full heightfreqs:
    irow = np.fft.fftfreq(h).reshape(h,1)
    # cols are halfed
    icol = np.fft.fftfreq(w)[:outw].reshape(1,outw)
    r = np.exp(-(icol*icol+irow*irow)/(low*low))
    return r

Then it's just a case of applying this to the source images with an appropriate cutoff:

lowf = np.array(PIL.Image.open(r"stewart_edit.jpg"))
highf = np.array(PIL.Image.open(r"lennon_edit.jpg"))
h,w,_ = lowf.shape

hipass = 1-gauss_mask((w,h), 15.0/768)
lowpass = gauss_mask((w,h), 15.0/768)
lowout=np.fft.irfft2(lowpass*np.fft.rfft2(lowf[:,:,2]))
hiout=np.fft.irfft2(hipass*np.fft.rfft2(highf[:,:,2]))
imsave("patrick_low.png", lowout, cmap='gray')
imsave("lennon_high.png", hiout, cmap='gray')

The resulting images look like:


When we add the together (with some suitable weighting) to get:


Which, apart from looking like Harry Potter, has the properties we want -- looks mostly like John Lennon up close and then Patrick Stewart from a distance.

I was using the iPython Notebook for doing this, as it allowed me to easily play around with the parameters and see the intermediate results as I was going along. I've put the both the ipynb file and the plain python code on github (requires NumPy, and iPython for showing images).

Friday, January 3, 2014

Bottle with SSL

Bottle with SSL

I was looking into using the bottle python web framework with SSL, and came across a few links on Google. They all seemed to use other third party modules for adding SSL support, and when I came across this post explaining how to add SSL to the built-in SimpleHTTPServer, I wondered if the same could be done with bottle.

It can. We just need to create a custom bottle.ServerAdapter which wraps the socket with ssl. The following should create a very basic SSL server using bottle (you need to create a ssl certificate first, I'm assuming it's called 'server.pem' in the following):

# Trying SSL with bottle
# ie combo of http://www.piware.de/2011/01/creating-an-https-server-in-python/
# and http://dgtool.blogspot.com/2011/12/ssl-encryption-in-python-bottle.html
# without cherrypy?
# requires ssl

# to create a server certificate, run eg
# openssl req -new -x509 -keyout server.pem -out server.pem -days 365 -nodes
# DON'T distribute this combined private/public key to clients!
# (see http://www.piware.de/2011/01/creating-an-https-server-in-python/#comment-11380)
from bottle import Bottle, get, run, ServerAdapter

# copied from bottle. Only changes are to import ssl and wrap the socket
class SSLWSGIRefServer(ServerAdapter):
    def run(self, handler):
        from wsgiref.simple_server import make_server, WSGIRequestHandler
        import ssl
        if self.quiet:
            class QuietHandler(WSGIRequestHandler):
                def log_request(*args, **kw): pass
            self.options['handler_class'] = QuietHandler
        srv = make_server(self.host, self.port, handler, **self.options)
        srv.socket = ssl.wrap_socket (
         srv.socket,
         certfile='server.pem',  # path to certificate
         server_side=True)
        srv.serve_forever()

@get("/x")
def get_x():
 return "Hi there"

#instead of:
#run(host="0.0.0.0", port=8090)
#we use:
srv = SSLWSGIRefServer(host="0.0.0.0", port=8090)
run(server=srv)

And that's it. You should now be able to go to https://localhost:8090/x and see 'Hi there'. The file's also available on github.