Note (February 5): In the original version of this story, I compared a low-quality copy of the 1896 film to the scaled-up version. Shiryaev actually started with a higher quality scan of the film, and many of the differences I saw reflected his better source material, not the upscaling algorithm. I updated the first video below to the one that Shiryaev used, but I left the text of the story as it is.
Arrival of a Train at La Ciotat is one of the most famous films in cinema history. Shot by the French filmmakers Auguste and Louis Lumière, it reached a level of quality unprecedented for its time. Some people consider its commercial exhibition in 1896 to be the birth of the motion picture industry. An urban legend, probably apocryphal, says viewers found the images so realistic that they screamed and ran to the back of the room as the train approached. I’ve embedded a video of the original movie above.
Of course, humanity’s standards for realism have risen dramatically over the past 125 years. Today, the Lumière brothers’ masterpiece looks grainy, murky, and decidedly old. But a man named Denis Shiryaev has used modern machine learning techniques to upscale the classic film to 21st-century video standards.
The result is remarkable. When you look at the scaled-up version, the world of our great-great-great-grandparents comes to life. Previously cloudy details of the train, the clothes and the faces of the passengers now stand out clearly.
How did Shiryaev do it? He says he used a commercial image-editing package called Gigapixel AI. Created by Topaz Labs, the package lets customers upscale images by up to 600 percent. Using neural networks, Gigapixel AI adds realistic details to an image to keep it from looking blurry when it is enlarged.
As the name implies, neural networks are networks of artificial neurons: mathematical functions that convert a set of input values into an output value. The main feature of neural networks is that they can be trained. If you have some sample inputs whose “correct” outputs are known, you can tune the parameters of the network to increase the chance that it gives the correct answers. The hope is that this training will generalize: once the network is trained to produce the correct answers for inputs it has seen before, it will also produce correct answers for inputs it has not yet seen.
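To make the idea of training concrete, here is a minimal sketch in Python. It trains a single linear neuron, which is far simpler than the networks in a product like Gigapixel AI. The training rule (plain gradient descent), the function names, and the example data are all assumptions chosen for illustration, not a description of any real product’s internals.

```python
# A single artificial neuron: a weighted sum of its inputs plus a bias.
def neuron(weights, bias, inputs):
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def train(examples, steps=2000, learning_rate=0.05):
    """Nudge the parameters to shrink the error on examples with known answers."""
    n_inputs = len(examples[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(steps):
        for inputs, target in examples:
            error = neuron(weights, bias, inputs) - target
            # Move each parameter a small step in the direction that reduces the error.
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical training data that follows the rule: output = 2*a - b + 0.5
examples = [
    ((0.0, 0.0), 0.5),
    ((1.0, 0.0), 2.5),
    ((0.0, 1.0), -0.5),
    ((1.0, 1.0), 1.5),
]
weights, bias = train(examples)

# An input the neuron never saw during training.
unseen = (0.5, 0.25)
prediction = neuron(weights, bias, unseen)
print(f"prediction for unseen input: {prediction:.3f}")
```

If training has generalized, the prediction for the unseen input should land close to what the underlying rule gives for it (2 × 0.5 − 0.25 + 0.5 = 1.25), even though that input never appeared in the examples.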
To train a network, you need a database of examples where the correct answer is already known. Sometimes AI researchers have to hire humans to produce these correct answers by hand. But for upscaling images, there’s a handy shortcut: you start with high-resolution images and downsample them. The low-resolution images become your inputs, and the high-resolution originals serve as the “correct” outputs the network is trained to reproduce.
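The downsampling shortcut can be sketched in a few lines of Python. This example assumes a grayscale image stored as a list of rows of brightness values and shrinks it by averaging blocks of pixels. The helper names `downsample` and `make_training_pairs` are hypothetical, and block averaging is just one simple way to downsample.

```python
def downsample(image, factor=2):
    """Shrink a grayscale image by averaging each factor-by-factor block of pixels."""
    height, width = len(image), len(image[0])
    small = []
    for top in range(0, height - factor + 1, factor):
        row = []
        for left in range(0, width - factor + 1, factor):
            block = [
                image[top + dy][left + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def make_training_pairs(high_res_images, factor=2):
    """Pair each low-resolution input with the high-resolution original as its target."""
    return [(downsample(image, factor), image) for image in high_res_images]


# A tiny 4x4 "photo" with a checkerboard of dark and bright 2x2 patches.
high_res = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
low_res, target = make_training_pairs([high_res])[0]
print(low_res)
```

No human labeling is needed here: every high-resolution image yields a training pair automatically.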
“A neural network analyzes thousands of photo pairs to figure out how details are usually lost,” explains Topaz Labs on their Gigapixel AI product page. “The algorithm learns to ‘fill in’ information into new images based on what it has learned, effectively adding new details to your photo.”
Show the neural network a low-resolution image of a face and it will figure out that it is a face and fill in the correct details for the subject’s eyes, nose and mouth. Show the neural network a low-resolution brick building and it will add an appropriate brick pattern in the high-resolution version.
An obvious next step would be to colorize the video. Neural networks can do that too with the same basic technique: start with some color photos, convert them to black and white, then train a neural network to reconstruct the originals in color.
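The same trick for building training data might look like this for colorization. This sketch assumes images stored as rows of (red, green, blue) tuples and converts them to grayscale using the standard BT.601 luma weights. The function names are hypothetical, and a real colorization system would go on to train a network on the resulting pairs.

```python
def to_grayscale(color_image):
    """Convert rows of (r, g, b) pixels to brightness values using BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in color_image
    ]


def make_colorization_pairs(color_images):
    """Pair each black-and-white input with the color original as its target."""
    return [(to_grayscale(image), image) for image in color_images]


# A tiny 2x2 "photo": a green pixel, a brown pixel, a black pixel, a white pixel.
color_photo = [
    [(34, 139, 34), (139, 69, 19)],
    [(0, 0, 0), (255, 255, 255)],
]
gray_input, color_target = make_colorization_pairs([color_photo])[0]
print(gray_input)
```

Note that the green and brown pixels end up as similar shades of gray, which is why a network needs context, such as recognizing a tree or a gravel path, to guess the original color.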
I dropped a frame from Shiryaev’s video into the Colorize Images app for Android, which uses machine learning to color images automatically. As you can see, it does a pretty good job, correctly inferring that trees should be green, gravel brownish, and men’s jackets black. I’d love to see someone with more time and better tools colorize Shiryaev’s upscaled version of the Lumière brothers’ classic.