Let's dream about an idea in our next reading-powernap. Over the last week I often found myself looking at weird pictures on Twitter with the hashtag #DeepDream, like this one:
At first glance the image just looked like a badly compressed version of an otherwise nice picture. But on the second look I recognized a Chinese temple in the mountains which wasn’t there in the first place. This picture is taken from a post on the Google Research Blog about an algorithm called Deep Dream. So what is this Deep Dream thing and what does it do? The blog goes into great detail on that in another post, “Inceptionism: Going Deeper into Neural Networks”. Essentially it is all about artificial neural networks (the technique behind deep learning, which is a branch of machine learning). So how does all this work?
How does DeepDream work?
Just a heads up: I’m trying to explain the concept behind it. I am certainly not 100% right on everything 😉
Here’s an excerpt from the second post:
We train an artificial neural network by showing it millions of training examples and gradually adjusting the network parameters until it gives the classifications we want. The network typically consists of 10-30 stacked layers of artificial neurons. Each image is fed into the input layer, which then talks to the next layer, until eventually the “output” layer is reached. The network’s “answer” comes from this final output layer.
If you are interested in the details, you should definitely check out the post.
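To make the "stacked layers" idea from the excerpt a bit more concrete, here is a minimal sketch in plain Python. It is not the network from the Google post: the layer sizes, the random weights and the names `make_layer`, `layer_forward` and `network_forward` are all made up for illustration. It only shows how an input is handed from layer to layer until the last layer gives the "answer".

```python
import math
import random

def make_layer(n_inputs, n_outputs, rng):
    """Build one fully connected layer with random weights and zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases

def layer_forward(layer, inputs):
    """Each neuron sums its weighted inputs and squashes the result with tanh."""
    weights, biases = layer
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def network_forward(layers, pixels):
    """Feed the input into the first layer, then pass each result to the next layer."""
    activations = pixels
    for layer in layers:
        activations = layer_forward(layer, activations)
    return activations

rng = random.Random(0)
# Assumed toy sizes: 4 input "pixels", two hidden layers, 2 output classes.
layer_sizes = [4, 5, 3, 2]
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

pixels = [0.1, 0.9, 0.5, 0.3]
answer = network_forward(layers, pixels)
# The "answer" is the output neuron with the strongest activation.
predicted_class = answer.index(max(answer))
```

Since the weights here are random and untrained, the predicted class means nothing yet. A real network would have 10-30 such layers and millions of trained weights.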
For the TL;DR crowd: human beings sit in front of machines and teach them the difference between words, accents or pixels/perspectives, and add links and meanings to the information. So the first step is for the machine to recognize a banana in a picture as a banana. Or a fork as a fork. A fork, for example, has a handle and usually between 2 and 4 tines; that is what makes it a fork, while color, shape, size or perspective don’t matter.
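Here is a rough sketch of what "teaching" looks like in code. It uses a single artificial neuron and the classic perceptron learning rule, which is much simpler than the training used for real deep networks (those use backpropagation over millions of images). The two features (has a handle, number of tines) and the handful of examples are invented for this illustration.

```python
def predict(weights, bias, features):
    """Return 1 ("fork") if the weighted sum is positive, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0

def train(examples, max_epochs=1000):
    """Perceptron rule: nudge the weights whenever an example is misclassified."""
    weights = [0, 0]
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:
            break  # every example is classified correctly, stop adjusting
    return weights, bias

# Made-up training data: (has_handle, number_of_tines) -> 1 if it is a fork.
examples = [
    ((1, 2), 1),  # handle, 2 tines: fork
    ((1, 3), 1),  # handle, 3 tines: fork
    ((1, 4), 1),  # handle, 4 tines: fork
    ((1, 0), 0),  # handle, no tines: spoon
    ((0, 0), 0),  # no handle, no tines: banana
    ((0, 3), 0),  # tines but no handle: not a fork
]

weights, bias = train(examples)
results = [predict(weights, bias, features) for features, _ in examples]
```

The machine is never told the rule "handle plus tines". It only sees examples with the right answer and adjusts its numbers until its own answers match.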
The second step is letting the machine visualize what it understood as a fork and create an image from scratch. The third and most interesting step (which takes us back to DeepDream) is interpreting a random image that is given to the system. Through artifacts, contrast and/or objects in the image, the machine can find shapes that wouldn’t be visible at first. Until now that was mostly a human domain: creativity in shapes and forms. Although the machine has to learn (like a child) to associate the shape of a fish with the animal, it can find or overinterpret those shapes itself later on. While analysing, it takes a layered approach: the first layers pick up simple things like contrasts and edges, and the deeper layers respond to more and more complex shapes built from them:
The final step is choosing how many iterations the program should run. The more iterations, the further the result drifts away from the original, up to a totally different image or even images created from random noise.
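The iteration idea can be sketched like this. As far as I understand it, DeepDream repeatedly changes the image a little so that something inside the network responds more strongly (gradient ascent on the image). The real thing does this with a trained deep network; the toy below uses one made-up linear "neuron" that likes a plus-shaped 3x3 pattern, and the names `PATTERN`, `WEIGHTS`, `activation` and `dream` are my own for this example.

```python
import random

# Made-up 3x3 pattern (flattened) that our single toy neuron is "looking for".
PATTERN = [0, 1, 0,
           1, 1, 1,
           0, 1, 0]
# The neuron likes bright pixels on the pattern and dark pixels everywhere else.
WEIGHTS = [1.0 if p else -1.0 for p in PATTERN]

def activation(image):
    """How strongly the neuron responds to the image."""
    return sum(w * px for w, px in zip(WEIGHTS, image))

def dream(image, iterations, step=0.1):
    """Nudge the pixels a little in each iteration so the neuron responds more."""
    image = list(image)
    for _ in range(iterations):
        # For a linear neuron, the gradient with respect to each pixel is its weight.
        gradient = WEIGHTS
        # Take a small step uphill and keep the pixel values between 0 and 1.
        image = [min(1.0, max(0.0, px + step * g))
                 for px, g in zip(image, gradient)]
    return image

rng = random.Random(42)
noise = [rng.random() for _ in range(9)]   # start from random noise

few = dream(noise, iterations=2)    # still mostly noise
many = dream(noise, iterations=20)  # the pattern has taken over the image
```

With only a few iterations the noise is barely changed. With many iterations the neuron's favorite pattern takes over the whole image, which is the same effect as the network "finding" temples or landscapes that were never there.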
The images above, for example, were created by a network that was trained on places, landscapes and cities, and because of that the system found landscapes and cities in random noise. And you can do all of that yourself with DeepDream 🙂 I like the random noise results the most, and Twitter is already full of stuff on #DeepDream. Check it out at the link, or get the code and try it 🙂
Another point of view
The Verge published an article about DeepDream as well. The DeepDream algorithm was also adapted to video (check out the source here), and here is a scene from the famous drug-centred movie “Fear and Loathing in Las Vegas” run through DeepDream:
TL;DR2: Humans teach machines to observe the world like human beings and to play with shapes. Try it yourself.