Generating all possible pictures

Think of an image of 800 x 600 pixel and 24 bit of color (8 bit per each RGB component). Its trivial binary representation is a sequence of 11520000 bits (800 x 600 x 24) and we can think of each picture as being a natural number.

Imagine now that we write an computer program that generates all these pictures one by one, incrementing the natural number by one in each round.

Running this algorithm for enough time you would eventually get:

- a picture of your face
- a picture of you in the Moon
- a picture of you with Marlin Monroe and James Dean
- pictures of ancient Earth, with dinosaurs
- pictures of all the paintings of Leonardo da Vinci, Van Gogh or Picasso
- pictures of all the pages of Shakespeare’s writings
- pictures of proofs of all relevant mathematical theorems (already proved or not)
- pictures of all great music compositions (already written or not)
- pictures of Microsoft Office and Windows source code
- pictures/printscreens of all pages in the World Wide Web, including all the versions of Wikipedia

Warning: don’t do this at home unless you can wait for some billion years between each pair of interesting pictures you would get!

Still, it’s interesting to realize that you can compress all the world’s information to a short and trivial program, all you have to do is add enough useless data to it!

On Kolmogorov Complexity

This semester I assisted to a course on Kolmogorov Complexity and Minimum Description Length, given at the Institute for Logic, Language and Computation of the University of Amsterdam.
The main lecturer, Paul Vitanyi, is the c0-author of THE book on Kolmogorov Complexity. Peter Grünwald, invited professor in two lectures, is the author of THE book on the Miniumum Description Length principle. The exercise sessions were given by a talented PhD student of them, Wouter Koolen-Wijkstra.

But what is this all about?
In simple words, the Kolmogorov Complexity of a sequence of characters is the length of the shortest program that outputs that sequence. It is the theoretical limit of compression.

Science in Summer time

If everything goes as planned this year I am attending two Summer Schools.

The first one, the International Computer Vision Summer School 2008 , will be hosted in Sicily, Italy in 14-19 July. The program seems to be quite good and it will cover topics like object detection, tracking or 3D reconstruction, among others. There’s also a reading group on “how to conduct a literature review and discover the context of an idea“. The challenge is to see how far back in the past one can track the origins of a scientific idea. For example, the AdaBoost is a well known machine learning meta-algorithm, in which a sequence of classifiers is progressively trained focusing on the instances misclassified by previous classifiers. The set of classifiers is then combined by a weighted average. It was introduced by Freund and Schapire in 1996. This is easy to track, the question however is: can you find the same or similar core idea, or intution, somewhere else back in the past? Possibly from a different domain?
It’s gonna be fun!

The second one is the 10th Machine Learning Summer School, 1-15 September, Ile de Re, France. The program is also quite nice, but I still don’t have the confirmation I can attend it.
I would be specially interested in Rich Sutton‘s lecture on “Reinforcement Learning and Knowledge Representation” although hearing about Active Learning, Bayesian Learning, Clustering, Kernel Methods, etc. also sounds quite appealing.

Looking forward to science in summer time!