Genetic Programming: Code smarter than you!

Here in sweden there are currently circulating some email with a challange about solving a math puzzle.
It’s nothing special really but the mail goes something like this (translated):
(Don’t blame me for the claims in the quote, it’s not my words)

It’s said that only people with an IQ over 120 can solve the following problem:

If we assume that:
2 + 3 = 10
7 + 2 = 63
6 + 5 = 66
8 + 4 = 96

How much is?
9 + 7 = ????

The mail contains an excel file with a password and can only be opened if you know the answer to the above.

Just for the hell of it I entered the problem into my old genetic expression evolver:
http://rogeralsing.com/2008/02/07/genetic-programming-math/

The application is based on genetic programming and does use genetic crossover and a real population (unlike my EvoLisa sample).

The problem was described like this:
problem.Cases.Add(new Case(2, 3, 10));
problem.Cases.Add(new Case(7, 2, 63));
problem.Cases.Add(new Case(6, 5, 66));
problem.Cases.Add(new Case(8, 4, 96));

The first and second arguments are variable values and the last argument is the expected output.

And here is the output of the application:

As you can see on the screenshot, the application have solved the equation in 250 generations (a few milliseconds).

That’s probably faster than you solved it ;-)

PS.
If it makes you feel better, I didn’t solve it at all, I go into fetus position on the floor when I see math problems ;-)

//Roger

Evolutionary Compression

When I first posted the “Evolution of Mona Lisa” post, there were a few questions raised regarding how well an evolutionary approach to image compression would work out.
A few days later, some guys at the NVidia forms started to try out the concept using hardware accellerated graphics.
That attempt was fairly successful, it was better than normal JPG, but worse than JPG2000.

Now there is another attempy made by Neil Graham, you can read his article about it here: http://www.screamingduck.com/Article.php?ArticleID=46&Show=ABCE 
This attempt seems to deal much better with fine details than the CUDA version did.

I think that the final results shown in that article are awesome, although I would like to have seen some reference images using other compression algorithms with the same data size.

//Roger

Scaling Clustered Evolution: 1 + 1 = 4

This is a follow up on: http://rogeralsing.com/2009/01/01/clustering-evolution-we-have-lift-off/

Yesterday I got the chance to test run the cluster version of EvoLisa on a 64 bit 4 core machine.
(Thanks to my collegue Ulf Axelsson for the help on this one)

The results from the cluster version are quite interesting.
One could expect that you would get a maximum of 200% performance by using two nodes instead of one.
However, this is not the case, we are getting 400% performance by doing this.

Doubling the CPU capacity and get 4 times as much work done.

How is this possible?

This really confused me for a while.
But the reason is quite obvious once you figure it out.

Let’s assume the following:

* We use one core.
* We are running 1000 generations with 50 polygons over 10 seconds.
* 1 out of 10 mutations are positive.

This gives us approx 100 positive mutations over 10 seconds.

If we add one more core we would get:

* We use two cores in parallel.
* We are running a total of 2000 generations, with 50 polygons per node over 10 seconds.
* 1 out of 10 mutations are positive.

This would give us approx 200 positive mutations over 10 seconds. 
Thus, this would give the expected 200% performance.

BUT:

We are NOT rendering 50 polygons per core in this case.
Each core is only rendering 25 polygons each.

During those 10 seconds, we are actually able to run 2000 generations instead of 1000 per core, thus, running a total of 4000 generations over 10 sec.
Which in turn results in approx 400 positive mutations during the same time span.

We have doubled the CPU capacity and halved the work that needs to be done per node.
Thus, we get a * 2 * 2 perf boost.

Pretty slick :-)

PS.
The 4 core machine was able to render the Mona Lisa image with the same quality as the last image in the evolution series in: 1 min 34 sec!

//Roger

Clustering Evolution – We have lift-off

After some horrible failures trying to cluster EvoLisa, I finally succeeded!
(And some creds goes out to Fredrik Hendeberg on this solution too)

All of my other attempts failed because of the overhead in the communication.
I was trying to synchronize things too often, simply because I didn’t see how it would be possible to solve the original Mona Lisa problem w/o doing so.

The key to successful clustering is to communicate as little as possible and run jobs in isolation as much as possible.

And this is what I’ve done now.
Each node gets ” 50 / NodeCount ” polygons.
If I have 2 nodes, they get 25 each.

So how do I get those two nodes to paint one and the same image with their 25 polygons each?

The solution is quite simple;
Each node runs in isolation for e.g. 1000 generations.
Then they pass their own DNA over to all the other nodes.
The receiving nodes then construct a “foreground image” and a “background image” based on the DNA from the sending nodes, depending on their ordinal/rank/index in the cluster.
Once that process is complete, each node goes back to running another 1000 generations, but now using the static fore and background while rendering.

monapartition

So each node is sort of rendering the entire image, but where all the polygons except for its own are static for X generations.

This works because the longer the simulation runs, the less changes there are to all the polygons.
So each node can fine-tune it’s own polygons together with the static fore and background since we know that the fore and background will not change that much until the next sync.

There will be a bit of a ping-pong effect in the beginning before all of the polygons on all the nodes start to make sense together.
But that doesn’t matter because it is only for a short period before everything start to stabilize.

So we are actually co-evolving one organism per node, where all of the organisms needs to harmonize with each other in order to get a good fitness level.
There is a sort of symbiosis going on now.

As a result of this, the new algorithm is no longer “greedy” and can in some cases result in a worse fitness level than the previous X generations, but the positive effect by far out-weights this.

The effect was quite dramatic on my 2 core laptop.
With a single core, it takes about 12-20 minutes to reach the fitness level that I use for my tests (The same fitness level as the last image as the original Mona Lisa evolution series).

With two cores, I reach the same fitness in 3-5 minutes.
That’s a pretty nice speedup.

The performance gain is actually more than 100%, and my best guess is that the OS is eating some of the power of the first core, the one that is used when running in single core mode.

Now I just need to find a multi core computer to see how it works on more than two cores.

//Roger

Hill climbing, GP, GA, EP, ES

[Edit]
See Maarten’s comments for more info
[/Edit]

There have been allot of discussions about what sort of algorithm the EvoLisa app uses.
Quite a few have claimed that it is a hill climber.

So I think it’s time to set everything straight here.

This is the definition of Hill climbing from wikipedia:

Hill climbing attempts to maximize (or minimize) a function f(x), where x are discrete states. These states are typically represented by vertices in a graph, where edges in the graph encode nearness or similarity of a graph. Hill climbing will follow the graph from vertex to vertex, always locally increasing (or decreasing) the value of f, until a local maximum (or local minimum) xm is reached. Hill climbing can also operate on a continuous space: in that case, the algorithm is called gradient ascent (or gradient descent if the function is minimized).*.

If we look at the EvoLisa app; there are no vertices, there is no graph, there are no edges that encodes nearness.
There is no continious space.

There is an individual with its own set of DNA, and that individual can breed offspring that can be quite far from the parent.
e.g. Each polygon in a drawing could change shape and size, each polygon could change color.
(The likeliness that it happens might be extremely small, but it is not impossible)
There are no paths that can be followed or backtracked like in a graph.

So if we stick to the wikipedia definition of hill climbing, then we can safely say that EvoLisa does not use a hill climber.

So what about my claims about Genetic Programming?

The wikipedia definition of GP is:

GP evolves computer programs, traditionally represented in memory as tree structures. Trees can be easily evaluated in a recursive manner. Every tree node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. Thus traditionally GP favors the use of programming languages that naturally embody tree structures (for example, Lisp; other functional programming languages are also suitable).

Non-tree representations have been suggested and successfully implemented, such as linear genetic programming which suits the more traditional imperative languages [see, for example, Banzhaf et al. (1998)]. The commercial GP software Discipulus,[6] uses AIM, automatic induction of binary machine code[7] to achieve better performance.[8] MicroGP[9] uses a representation similar to linear GP to generate programs that fully exploit the syntax of a given assembly language

OK, the first statement says GP evolves computer programs, that is sort of fuzzy, a program can be allot more than computational expressions.
But the next highlighted part implies that GP should deal with computational AST’s only.

IMO, for loops, variable declarations and other statements could very well be used in GP .
An AST that contains statements for drawing polygons onto a canvas when evaluated is also a program.
I could very well create a parser for a language that generates that very same AST and everyone using it would agree that it is a language and that they are writing programs when using it.

But let’s ignore what I think, lets stick with the definition here.

If we accept that GP only deals with computational programs and not imperative execution or declarations, then EvoLisa is not GP.

So what would be the correct definition of the algorithm then?

The closest definition would be:
http://en.wikipedia.org/wiki/Evolutionary_programming
or
http://en.wikipedia.org/wiki/Evolution_strategy

EvoLisa fits the 1+1 Evolutionary Strategy.
But it also fits the μ + μ EP since it relies on an AST with mutable values.

Which one of those two it is depends on what definition you set for “a program”
When is an AST an AST and not just data?

e.g. are these statements data or code? “DrawPolygon ( … )” or “print ‘foo’” .

//Roger

Follow

Get every new post delivered to your Inbox.