Creativity is Just Connecting Things

"Creativity is just connecting things. When you ask creative people how they did something, they feel a little guilty because they didn’t really do it, they just saw something. It seemed obvious to them after a while. That’s because they were able to connect experiences they’ve had and synthesize new things. And the reason they were able to do that was that they’ve had more experiences or they have thought more about their experiences than other people. (...)

Unfortunately, that’s too rare a commodity. A lot of people in our industry haven’t had very diverse experiences. So they don’t have enough dots to connect, and they end up with very linear solutions without a broad perspective on the problem. The broader one’s understanding of the human experience, the better design we will have."

- Steve Jobs

Defining Intelligence

Intelligence is a slippery thing to define. The following definition recently struck me:

That which allows one to maximize happiness over long term.

I like this definition because it is short (c.f. MDL, Occam's Razor), it makes logical sense, and it carries a lot of meaning without going into details of how to be intelligent. It is logical to me because of the following argument: Suppose a person is allowed to life two versions of his life starting from some fixed point in his life. All events and circumstances in the two versions are the same except for actions taken by the person. Then he can be said to be more intelligent in that version of his life in which he achieves greater happiness over the rest of his life.

Intelligence is needed in order to understand what actions will make us happy, for how long, and whether there will be any effects of those actions on our future happiness. Making decisions to maximize cumulative happiness is certainly a non-trivial task. Sometimes one must put oneself through short-term adversity (e.g. graduate school at little on no stipend, or an athlete undergoing gruelling training for a race) to be able to do well later. Sometimes, one decides to undertake an action that provides short term happiness, but at the cost of long term happiness. It takes intelligence to learn to avoid such behaviour n the future.

Modern definitions of intelligence from the scientific and psychology community are incredibly long-winded [Wikipedia]

A very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience. It is not merely book learning, a narrow academic skill, or test-taking smarts. Rather, it reflects a broader and deeper capability for comprehending our surroundings—"catching on," "making sense" of things, or "figuring out" what to do.

and:

Individuals differ from one another in their ability to understand complex ideas, to adapt effectively to the environment, to learn from experience, to engage in various forms of reasoning, to overcome obstacles by taking thought. Although these individual differences can be substantial, they are never entirely consistent: a given person's intellectual performance will vary on different occasions, in different domains, as judged by different criteria. Concepts of "intelligence" are attempts to clarify and organize this complex set of phenomena. Although considerable clarity has been achieved in some areas, no such conceptualization has yet answered all the important questions, and none commands universal assent. Indeed, when two dozen prominent theorists were recently asked to define intelligence, they gave two dozen, somewhat different, definitions.

The same Wikipedia page also lists various different definitions given by researchers. The long-windedness of these definitions is somewhat excusable as an attempt to be all-inclusive and general. But in the end, the notion of intelligence is a man-made model, invented to try and explain phenomena. I think a focus on happiness as the central phenomenon to be explained goes a long way in simplifying our understanding of intelligence.

In the eye of the Beholder

Why is the picture on the right more appealing than the one on the left?

What is it that we find more interesting about the picture on the right, compared to the one on the left? The picture on the left contains more information. So we are certainly not looking for more information. One might say we don't know how to interpret the image on the left into anything familiar, but it is television static. A more precise answer is given by Jurgen Schmidhuber, who argues convincingly that:

Artists (and observers of art) get rewarded for making (and observing) novel patterns: data that is neither arbitrary (like incompressible random white noise) nor regular in an already known way, but regular in way that is new with respect to the observer's current knowledge, yet learnable (that is, after learning fewer bits are needed to encode the data).

This explains the pictures on top. The picture on the left is not compressible because it is a matrix of uniformly random 0/1 pixels. The Monet on the right evokes familiar feelings, and yet adds something new. I think what Schmidhuber is saying is that the amount of compressibility should neither be too little, nor too much. If  something is not very compressible, then it is too unfamiliar. If something is too compressible, then it is basically boring. In other words, the pleasure derived first increases and then decreases with the compressibility, not unlike this binary entropy curve.

Let us ask the same question again for the following pair of images (you have to pick one over the other):

My guess is that most people will find the image on the right more appealing (it is for me at least). Please drop me a comment with a reason if you differ. When I look at the image on the right, it feels a little more familiar, there are some experiences in my mind that I can relate to the image - for example looking straight up at the sky through a canopy of trees (white = sky, black=tree leaves), or a splatter of semisolid food in the kitchen.

In order for an object to be appealing, the beholder must have some side information, or familiarity with the object beforehand. I learnt this lesson the hard way. About 2 years ago, I gave a talk at a premier research institution in the New York area. Even though I had received complements when I'd given this talk at other venues, to my surprise, this time, audience almost slept through my talk. I learnt later that I had made the following mistake: in the abstract I'd sent to the talk's organizer, I had failed to signal that my work would likely appeal to an audience of information theorists and signal processing researchers. My audience had ended up being a bunch of systems researchers. The reason they dozed through my talk was that they had just a bit less than the required background to connect the dots I was showing them.

It is the same with cultural side information or context -- the familiar portion of the object allows the observer to latch on. The extra portion is the fun. Without the familiar, there is nothing to latch on to. The following phrases suddenly take on a precise quantifiable meaning:

  • "Beauty lies in the eyes of the beholder": the beholder carries a codebook that allows her to compress the object she is observing. Each beholder has a different codebook, and this explains 'subjective taste'.
  • "Ahead of its time": Something is ahead of its time if it is very good but does not have enough of a familiar portion to it, to be appreciated by the majority of observers.

I can think of lots of examples of art forms that deliberately incorporate partial familiarity into them -- e.g. music remixes, Bollywood story lines. Even classical music first establishes a base pattern and then builds on top of it. In this TED talk, Kirby Ferguson argues that all successful creative activity is a type of remix, meaning that it builds upon something familiar.

Takeaways:

  1. When writing a paper or giving a talk, always make sure the audience has something familiar to latch on to first. Otherwise, even a breakthrough result will appear uninteresting
  2. Ditto for telling a story or a joke, DJing music at a party, or building a consumer facing startup. Need something familiar to latch on to.
  3. In some situations it may be possible to gauge the codebook of the audience (e.g. having a dinner party conversation with a person you just met), to make sure you seem neither too familiar, nor too obscure.

Learning using Puzzles

Mathematical and logic puzzles are a great way to learn. I think they should be used to a greater extent in school and college education than they are, because they:

  1. Create an excitement in the mind of the student/solver.
  2. Force the solver to think about what method to use to attack the problem. This is much closer to real world situations, where one is not told about what area a problem is in. Tech companies recognize this when they use puzzles  while interviewing job candidates as a way to observe how the candidate thinks.
  3. Over time, allow students to recognize patterns in problems and the best way to attack them, so that wehn they are faced with a new, real world problem, they have a good idea of what to try first.

I am certainly not the first one to note the educational value ofpuzzles. Martin Gardner, whose puzzles I have tremendously enjoyed has himself advocated for puzzle-based learning.

Each puzzle is usually based on one (or, sometimes two, but very rarely more than two) fundamental areas of math: probability, number theory, geometry, critical thinking etc. For example, take the following puzzle (try these puzzles before reading answers at the end of the post):

[1] By moving just one of the digits in the following equation, make the equation true:

$latex 62 - 63 = 1$

It should be clear that this puzzle is part math and part logic.

Often there is more than one way to arrive at the solutions -- e.g. using a geometric visualization for a algebraic problem in 2 or 3 variables. For example take the following puzzle:

[2] A stick is broken up into 3 parts at random. What is the probability that the thee parts form a triangle?

One may choose to solve the above problem using algebra, but it can be visualized geometrically too.

Speaking of geometric methods, there is also something to be said about solving puzzles inside one's head, rather than using pen and paper. I feel it adds to the thrill because it can be more challenging. It is also fun to visualize objects and their relationships in mathematical terms,  completely inside one's head. The following is an example of a pretty hard puzzle, which was solved entirely inside one mathematicians head :

[3] Peter is given the product of two integers both of which are between 2 and 99, both included. Sally is given only their sum. The following conversation takes place between them:

P: I cannot figure out the two numbers from the product alone.

S: I already knew that you couldn't have figured them out.

P: Oh, now that you say that, I know what the two numbers are.

S: I see. Now that you say you know what they are, I also know what they are.

What are the two integers?

___

ANSWERS:

[1] $latex 2^6 - 63 = 1$

[2] Without loss of generality, assume that the stick is of unit length. Since the three lengths add up to 1, and are each $latex \in [0,1]$, they can be treated as three random variables $latex (x,y,z)$ in 3D space, lying on the 2-dimensional triangular simplex with vertices at (1,0,0), (0,1,0) and (0,0,1). The problem now boils down to determining what fraction of the surface of this triangle corresponds to tuples that cannot form a triangle. From the triangle inequality, we know that $latex x + y > z$ and so on for the other two pairs. A little mental visualization reveals that these boundary conditions are met when each side of these inequalities is $latex \frac{1}{2}$. The region corresponding to feasible points is a smaller equilateral triangle inscribed within the simplex and is one-fourth the area of the simplex. The answer therefore must be $latex \frac{1}{4}$.

[3] The answer can be found in this PDF of a note written by EW Dijkstra, who solved the problem in his head.

Andre Geim

"Many people choose a subject for their PhD and then continue the same subject until they retire. I despise this approach. I have changed my subject five times before I got my first tenured position and that helped me to learn different subjects." - Sir Andre Konstantin Geim (2010 Nobel Prize winner in Physics).

You are my hero Andre.

Make webcomics easily

I have wanted to draw webcomics for at least the last 3 years, but never got around to figuring out an easy way to do it. Drawing on a computer with a mouse is a disaster. There are special tablets like Wacom's that artists use (I believe xkcd uses Wacom's tablet). There are also some high-end, expensive iPad apps (e.g. brushes). But I wanted to do very simple drawing and without spending any money. I finally decided to put together a easy to use method for drawing simple black & white comics. I think the simplest way to draw a webcomic is low-tech:

  1. Draw on paper with a black marker.
  2. Take a picture with your phone and upload to a computer. Unfortunately, a picture taken by a phone is not simple b&w, but contains many colours, graininess of the paper, the effect of lighting, etc. Therefore:
  3. Run the picture though a pixel thresholder.

I wrote a small python program to threshold the value of each pixel and make each pixel either pure white or pure black. The result allows me to go from the JPG picture on the left (taken with an iPhone 3GS) to the PNG image on the right:

Each pixel in the left image is simply an integer between 0 and 255 (after converting to grayscale). Now, this should be easy to do using a basic image editing program by 'converting to B&W', but the caveat is that one cannot simply use a threshold of 128 because the pixels corresponding to blank space on the paper may not necessarily be below 128. The result of using a fixed threshold of 128 was a disaster:

There is probably a menu option somewhere on a good image editor to do what I want, but I don't want to deal with image editors, and besides, I want to automate webcomic creation as much as possible. So I settled on running k-means clustering on the pixel values with k=2 clusters. This gives two clusters, one corresponding to the blacks and one corresponding to the whites in the image:

Histogram of pixel values (0-255). Cluster of blacks on the left with centroid at 16. Whites on the right, centroid at 110.

The average of the two centroids can be taken to be the threshold. The threshold for the example above came out to be 63 -- pretty far from 128! By the way, the mean pixel value is 103, which would clearly not have worked as well as the mean of the cluster centroids did.

This code may be useful to others who want to put up their comics/doodling on the web. Unfortunately, the program has a number of python module dependencies (Scipy, PyPNG, Numpy), because I just wanted a quick hack. But I'll put up the code here anyway. Someday, I'll get a github account.

[sourcecode language="python"] import numpy, png, os, sys, pylab as plt import scipy from scipy.cluster.vq import vq, kmeans

infile = sys.argv[1]

print 'Converting to PNG...' #Make sure input is a png pngfile = infile.split('.')[0]+'.png' os.system('convert ' + infile + ' ' + pngfile)

print 'Reading in pixels...' f = open(pngfile, 'rb') r = png.Reader(file=f) k = r.read() l = list(k[2]) numplanes = k[3]['planes'] l = zip(*l) numrows = len(l) l = l[0:numrows:numplanes] # This will skip every numplanes element l = zip(*l) f.close()

# Create a 1-D array of all the pixel values l = numpy.array(l) pixels = [] for i in range(0, len(l)): for j in range(0, len(l[0])): pixels.append(l[i][j]) pixels = numpy.array(pixels)

print 'Computing threshold...' # Find the adaptive threshold by running one-dimensional k-means, with k = 2, and find the mean of the two cluster centers. j = scipy.cluster.vq.kmeans(pixels, 2) adaptivethreshold = (j[0][0] + j[0][1])/float(2)

print 'thresholding...' # now threshold for i in range(0, len(l)): for j in range(0, len(l[0])): if l[i][j] < adaptivethreshold: l[i][j] = 0 else: l[i][j] = 255

print 'Writing to file...' # Now write: thresholdfile = pngfile.split('.')[0]+'-threshold.png' f = open(thresholdfile, 'wb') w = png.Writer(len(l[0]), len(l), greyscale = True) w.write(f, l) f.close() [/sourcecode]

Fresh stock of puzzles

The past few weeknights have been spent solving puzzles from Martin Gardner's books, which I really enjoy. I just came across a book by Peter Winkler, whose name I had heard before in talks by Muthu. After reading a few of the puzzles on Amazon's look inside preview, this book is too hard to resist buying. Also, here [PDF] is a beautiful set of 7 puzzles compiled by Peter Winkler on the occasion of a remembrance for Martin Gardner in 2006.  Other places I have found good puzzles recently include this page maintained by Gurmeet Singh Manku. Puzzles + coffee + weekend afternoon = awesomeness.