Censorship and Manipulation in the Realm of Artificial Intelligence
Many have heard the historical tale of “the vanishing commissar”. The anecdote stands as perhaps one of the best examples of the seemingly impossible feat of deleting a person from history. The commissar is Nikolai Yezhov, head of Joseph Stalin’s secret police, and the vanishing occurred just after he fell out of favor with the communist dictator. Stalin, with the help of expert manipulators, erased all traces of the man from U.S.S.R. records and airbrushed him out of the photographic evidence of his existence. It was as if the man had never existed. That was 78 years ago. Today, budding artificial intelligence algorithms promise to make that same task as easy as the click of a button.
Enter Deep Angel, fresh off the shelves of MIT Media Lab’s Scalable Cooperation team. Conceived and directed by Matt Groh, the project originally formed as a by-product of object recognition research. The already massive (and continuously developing) body of object recognition algorithms is generally geared towards exactly what the name says: recognition. When an artificial intelligence is fed “labeled” pictures (yes, even in jpeg format) of various objects, its machine learning component learns to match the distribution of differently colored pixels in a new image to the “label” with the highest similarity, thereby recognizing the object. The basis for this technology, as with most artificial intelligence applications, rests on a learn-as-you-go paradigm: the wider the range of examples an algorithm is exposed to, the more accurate its recognition becomes. If that weren’t incredible enough, because most of the algorithms and picture databases are open-source and available to all, object recognition algorithms are shifting from being the final product to simply being the mechanical basis for other products. The accessibility of this technology, in short, is fueling its proliferation.
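To make the idea of matching pixel distributions to labels concrete, here is a deliberately tiny Python sketch. It is not how Deep Angel or any production recognizer works (modern systems typically rely on neural networks); it only illustrates the principle of learning from labeled examples and picking the most similar label. The function names, the use of color names as pixels, and the cosine-similarity measure are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def color_histogram(pixels):
    """Return the share of each color in a flat list of pixels."""
    counts = Counter(pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}


def cosine_similarity(a, b):
    """Similarity between two histograms: 1.0 is identical, 0.0 is disjoint."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled_images):
    """Average the histograms of all example images that share a label."""
    sums, counts = {}, Counter()
    for pixels, label in labeled_images:
        acc = sums.setdefault(label, Counter())
        for color, share in color_histogram(pixels).items():
            acc[color] += share
        counts[label] += 1
    return {
        label: {color: v / counts[label] for color, v in acc.items()}
        for label, acc in sums.items()
    }


def recognize(pixels, model):
    """Pick the label whose average histogram is most similar to the image."""
    hist = color_histogram(pixels)
    return max(model, key=lambda label: cosine_similarity(hist, model[label]))


# Hypothetical toy data: each "image" is just a list of color names.
training = [
    (["blue"] * 8 + ["white"] * 2, "sky"),
    (["blue"] * 7 + ["white"] * 3, "sky"),
    (["green"] * 9 + ["brown"], "grass"),
]
model = train(training)
guess = recognize(["blue"] * 6 + ["white"] * 4, model)
```

Adding more labeled examples to `training` refines the per-label averages, which is the toy equivalent of the learn-as-you-go behavior described above.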
Deep Angel is a wonderful illustration of this continuing rise. When Deep Angel is fed an image, its first step is to do exactly what has just been described above. It dissects the image, scanning for consistent pixel pattern distributions, matches them with those in its labeled database, and essentially gives a name to every object it recognizes in the picture. The novelty arises in what it does next. Once all available objects are recognized, it gives the user the option of simply removing one.
To do this, and to make it look natural, Deep Angel grabs the contour of the figure and basically cuts it from the image, as if engaged in a simple collage. It then returns to all other objects it had recognized as potentially lying in the image’s background, and engages in generative inpainting: essentially stretching the background over the “hole” by using the pixel distributions nearest to it, in the same way it had done before. The result, as illustrated in the gifs below, is quite staggering. Humans, even within immensely diverse environments, can be seamlessly removed from an image with the click of a button.
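The cut-then-fill step can be sketched in a few lines of Python. To be clear about what is swapped in: the sketch below uses simple diffusion-style filling, where each pixel in the hole is repeatedly replaced by the average of its neighbors, rather than the generative inpainting Deep Angel relies on. The function name `remove_object`, the grayscale grid representation, and the iteration count are assumptions chosen for illustration.

```python
def remove_object(image, mask, iterations=200):
    """Fill the masked region of a grayscale image from its surroundings.

    image: 2D list of floats (brightness values).
    mask:  2D list of bools, True where the object to remove sits.
    Assumes at least one pixel lies outside the mask.
    """
    height, width = len(image), len(image[0])

    # "Cut" the object out: start the hole at the mean background brightness.
    background = [
        image[y][x] for y in range(height) for x in range(width) if not mask[y][x]
    ]
    fill = sum(background) / len(background)
    result = [
        [fill if mask[y][x] else image[y][x] for x in range(width)]
        for y in range(height)
    ]

    # "Stretch" the background over the hole: each hole pixel becomes the
    # average of its up/down/left/right neighbors, repeated until it settles.
    for _ in range(iterations):
        updated = [row[:] for row in result]
        for y in range(height):
            for x in range(width):
                if not mask[y][x]:
                    continue  # background pixels are never modified
                neighbors = [
                    result[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width
                ]
                updated[y][x] = sum(neighbors) / len(neighbors)
        result = updated
    return result


# Hypothetical example: a bright square sitting on a left-to-right gradient.
scene = [[x / 4 for x in range(5)] for _ in range(5)]
hole = [[1 <= y <= 3 and 1 <= x <= 3 for x in range(5)] for y in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        scene[y][x] = 1.0  # the "object" to be removed
cleaned = remove_object(scene, hole)
```

In this example the filled region blends back into the surrounding gradient, so the bright square disappears. A generative model goes further by synthesizing plausible texture and detail instead of a smooth blend, which is what makes results on real photographs look convincing.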
Let’s pause for a moment. If you have any knowledge of the world of Adobe at all, you might not find any of this impressive. “We’ve been able to do this kind of stuff with Photoshop for a while now,” you might say. That much is true: programs like Photoshop would have made the commissar vanish with much greater ease than Stalin’s propagandists did. However, there are key differences. For one, Deep Angel is automated: no real labor has to be put into a removal. The AI recognizes an object and blends the background to match what its database deems most realistic given the pixel distributions. Because of that, it does not bend to the decree of the manipulator, but only to that of probability. In essence, it’s an algorithm: a complete automation that can easily be applied within other systems as well, without the need for manual labor. Inherently, then, it is not only easier to use but also more consistent and more readily applicable in other fields.
To illustrate the capabilities of their platform, the Deep Angel team tried to do just that, applying its technology beyond the “aesthetics of absence” that history has shown us. As depicted in the video below, titled “The Broken Flaneur”, the team of researchers has already applied the technology to video content as well, a far more difficult task. Here, we see the artificial intelligence essentially making a group of boats vanish into thin air. What’s amazing here is how seamless the video remains after the removal of the boats.
Now, this demo video, simple as it is, holds a great deal of value. For a first attempt at removing objects from a video, it is wildly successful. However, if the course of development of artificial intelligence algorithms has shown us anything, it’s that it is in their nature to build upon one another. As its machine learning becomes more fine-tuned, and the underlying generative inpainting approaches optimization, the technology developed here could easily be applied in a number of fields with immense social relevance. People and objects could be seamlessly removed from, for example, live video coverage. With greater computational power and speed, this technology may even come to be applied as a “filter” placed over a live image, ready to target and remove any object or person it’s able to recognize. And with time, as is typical of machine learning, it is likely to keep improving.
We live in a reality very far removed from the days of the “vanishing commissar”. However, in the uncertain era of fake news, media manipulation remains amongst the most relevant areas of discourse for populace, government and press alike. The brilliant tools artificial intelligence provides can have immediate negative repercussions. The ease and seamlessness of manipulating an image, of removing a subject from a scene, could create an incredibly sizable body of entirely distorted content which, thanks to our ever-growing connectedness, can find refuge in any of the internet’s nooks and crannies. In that case, our conception of the content we run across online is going to have to change. Once images are fully manipulable with such alarming fidelity and ease, our conception of authenticity is going to have to transform.
In an effort to drive this exact point home, the team behind Deep Angel offers online didactic sessions and quizzes aimed at providing an understanding of just how easy this kind of manipulation is and will be. Trying to tell, between two pictures, which is the original and which has been manipulated is already a difficult task as it stands; imagine it a few developments from now. Stalin’s “vanishing commissar” will be a ridiculously easy feat, something that happens with two or three clicks and zero artistic skill. Yet, now as then, we must strive towards understanding this ease, shifting our perception to match our technology’s capabilities. Given the unstoppable nature of its rise, all we can do, much like an algorithm, is learn, understand and adapt, or forever be left uncertain about what is authentic.