2018/05/31 14:52:22

Generative adversarial network (GAN)

Main article: Generative artificial intelligence

2018

Nvidia AI creates images of human faces that look exactly like real ones

In December 2018, Nvidia researchers Tero Karras, Samuli Laine and Timo Aila published a paper and an accompanying video demonstrating what their artificial intelligence can do when generating human faces.

The paper, published on arXiv, describes a new architecture for creating and mixing images, in particular human faces, which "leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation."

This means that the system is better aware of which differences between images matter, and at which scales. For example, an earlier system could produce two "distinct" faces that were largely identical, except that the ears were hidden on one and the shirts were different colors. These are not truly distinguishing features, but the system did not know which details to focus on.

In the new architecture, much attention is paid to style transfer, in which important stylistic aspects of, say, a painting are extracted and used to create a different image. Here, "style" means not so much brushstrokes or color space as composition (centered, looking left or right, and so on) and the physical characteristics of the face (skin tone, freckles, hairstyle). Features also come at different scales: at the fine level these are individual facial features, at the middle level the overall composition of the shot, and at the highest level aspects such as the color palette. Changing all levels at once changes the image completely, while adjusting only individual levels may change just the hair color or the presence of freckles or facial hair.
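
The idea of adjusting styles level by level can be illustrated with a small sketch. It is not Nvidia's implementation: the number of style slots (18), the three level names, the slot ranges and the function name mix_styles are all assumptions chosen for illustration, and plain strings stand in for the learned style vectors a real generator would use.

```python
from typing import Dict, Iterable, List, Sequence

# Assumption: the generator takes one style per layer, 18 layers in total.
NUM_LAYERS = 18

# Assumption: a hypothetical grouping of layers into three scales.
LEVELS: Dict[str, range] = {
    "coarse": range(0, 4),
    "middle": range(4, 8),
    "fine": range(8, NUM_LAYERS),
}


def mix_styles(base: Sequence[str], donor: Sequence[str],
               levels: Iterable[str]) -> List[str]:
    """Return a copy of `base` whose styles at the given levels come from `donor`."""
    if len(base) != NUM_LAYERS or len(donor) != NUM_LAYERS:
        raise ValueError(f"expected {NUM_LAYERS} styles per image")
    mixed = list(base)  # copy, so the caller's list is not modified
    for name in levels:
        for layer in LEVELS[name]:
            mixed[layer] = donor[layer]
    return mixed


if __name__ == "__main__":
    face_a = ["A"] * NUM_LAYERS
    face_b = ["B"] * NUM_LAYERS
    # Swap only one level: most of face A is kept.
    print(mix_styles(face_a, face_b, ["fine"]))
    # Swap every level: the result is entirely face B.
    print(mix_styles(face_a, face_b, ["coarse", "middle", "fine"]))
```

Swapping a single level leaves the other styles of the base image untouched, which mirrors the point above that adjusting individual levels changes only some attributes, while changing all levels replaces the image entirely.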

Neural networks have learned to create portraits of non-existent people

In addition to faces, Nvidia's AI can also generate images of cars, cats and landscapes, since these rely on a largely similar algorithm for extracting low-, mid- and high-level features.

Nvidia's approach builds on a generative adversarial network (GAN), which is trained to create entirely new images that mimic the appearance of real photos.[1][2]

How generative adversarial networks were born

The history of technology offers numerous precedents in which an outstanding invention, against its author's will, is used to do harm well before it is used for good. This is the rule rather than the exception. Moreover, whereas this pattern used to be seen mainly in military applications, which gave politicians and generals grounds to claim that war is the engine of progress, today, thanks to information technology, the scope for such unsavory uses has expanded noticeably.

Generative adversarial network (GAN) algorithms, which have radically strengthened the potential of machine learning, are no exception. For several years GANs have been studied by a large number of research teams, so practical results are probably not far off, but so far it is the perverse use of GANs, in the form of deepfakes, that is spreading fastest.

With deepfakes, one can generate fake images and videos that are indistinguishable from real ones and use them to provoke all kinds of scandals, including political ones. While disapproving of deepfakes, one has to admit that without this technology for swapping in the faces of political leaders and nude film stars, GANs would have remained hidden in the depths of academic and corporate research, with results reported at scientific conferences and published in specialist journals.

Ian Goodfellow has become a cult AI celebrity, now referred to as the GANfather (father of GANs)

GANs grew out of an insight that struck a University of Montreal graduate student in 2014. His surname, Goodfellow, literally means "good companion". Tellingly, the insight came in a pub, and pubs, as is well known, play a special role in the history of computing. San Mateo had one where the creators of the first PCs, members of the Homebrew Computer Club, used to gather.

So, over a beer in the well-known Montreal tavern Les 3 Brasseurs ("The Three Brewers"), Ian Goodfellow's friends complained to him about the difficulties they were having in generating realistic images of human faces: the pictures came out blurred, and sometimes even lacked such important details as eyes or ears. To improve the quality, they planned to use information obtained from statistical analysis of a huge number of real photographs. Goodfellow upset his friends by pointing out that this would require enormous computing power, so nothing would come of it. He proposed going another way: using a second neural network and pitting the two networks against each other so that, in dialogue, they would produce images of the required quality.

His friends did not take him at his word. Annoyed, that same night, as soon as he got home, Goodfellow came up with a novel unsupervised machine learning algorithm, later called the generative adversarial network (GAN). In it, two networks work as a pair: one generates samples, and the other tries to tell genuine samples from generated ones. Over the course of joint training an equilibrium is reached in which both networks have improved significantly and the generated images can look almost like real ones.

The key idea Goodfellow built into GANs is that not one network, as is customary, but two are trained on the same data set at once. The first, called the generator, tries to create images that are as realistic as possible, while the second, the discriminator, compares them with the originals and rejects the unsuccessful ones. The discriminator's results are then used to train the generator. It is very important that the efforts of the two networks are balanced. This unity of creative and critical principles is typical of creative partnerships such as author and editor, or artist and critic.
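
The interplay of generator and discriminator described above can be sketched on one-dimensional numbers instead of images. This is a toy illustration, not Goodfellow's original networks: the linear generator, the logistic discriminator, the target distribution N(4, 1), the hyperparameters and all names are assumptions. The generator is updated to maximize log D(G(z)), a commonly used variant of the generator objective.

```python
import math
import random


def sigmoid(t: float) -> float:
    """Logistic function, clamped so that math.exp cannot overflow."""
    t = max(-60.0, min(60.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(steps: int = 200, lr: float = 0.05,
                  batch: int = 32, seed: int = 0):
    """Alternate discriminator and generator updates on 1-D data.

    Assumptions of this sketch:
      real data      x ~ N(4, 1)
      generator      G(z) = a * z + b, with noise z ~ N(0, 1)
      discriminator  D(x) = sigmoid(w * x + c), the estimated
                     probability that x is a real sample
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters
    w, c = 0.1, 0.0   # discriminator parameters
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: gradient ascent on
        # mean(log D(real)) + mean(log(1 - D(fake))).
        grad_w = grad_c = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            grad_w += (1.0 - d) * x
            grad_c += 1.0 - d
        for x in fake:
            d = sigmoid(w * x + c)
            grad_w -= d * x
            grad_c -= d
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Generator step: gradient ascent on mean(log D(G(z))).
        # The discriminator's verdict is the only training signal here.
        grad_a = grad_b = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            grad_a += (1.0 - d) * w * z
            grad_b += (1.0 - d) * w
        a += lr * grad_a / batch
        b += lr * grad_b / batch
    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print(f"generator: G(z) = {a:.3f} * z + {b:.3f}")
    print(f"discriminator: D(x) = sigmoid({w:.3f} * x + {c:.3f})")
```

In the first steps the generator's offset b moves away from 0 in the direction the discriminator rates as more "real". With models this simple the two players may keep oscillating rather than settle, which is one way to see why the balance between the two networks matters.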

One could say that GANs added, loosely speaking, a capacity for imagination to machines' recognition capabilities. GANs take machine learning to the next level: today networks are trained with supervised learning on enormous amounts of training data, and the creation of GANs is a serious step toward unsupervised learning. In the future, a self-driving car will be able not only to analyze the current situation and respond to it by following the instructions of a previously trained network, but also to accumulate knowledge on its own while driving, and even while parked, drawing data from the network.

Note that in Russian-language materials, for example on Wikipedia, the learning process is called "competitive." This translation gets it exactly backwards. "Concurrent" has nothing to do with competition: the word describes things that converge at a point, share a common point, or intersect at a point. As for the word "adversarial", in this context it should be understood in the sense of adversarial court proceedings, where two parties productively oppose each other in the search for truth.

The GAN idea proposed by Goodfellow was instantly picked up by his close colleagues. Yann LeCun, a recognized machine learning authority and now a leading AI researcher at Facebook, called GANs the most outstanding idea proposed in the field in the last 20 years. A couple of months after the insight in the pub, a paper on GANs by a group from the University of Montreal was published.[3] It triggered an explosion of follow-up research and an avalanche of new papers.

As a result, as in a fairy tale, the graduate student turned overnight into a cult AI celebrity. He is now called the GANfather (father of GANs), his modesty and youth notwithstanding.

Ian Goodfellow owes his success not only to Newton's apple, which in this case took the form of a beer mug, but also to the fact that he worked in the birthplace of deep learning, under the direct supervision of Yoshua Bengio, who, along with Yann LeCun, Andrew Ng and Geoffrey Hinton, was one of the leaders of the very "Canadian mafia" that staged a scientific coup in machine learning by introducing deep learning.

Goodfellow is now thriving on the Google Brain team. He still has not fully absorbed his change in status, and what worries him most is that he now has to devote most of his effort to fighting the malicious use of GANs, including deepfakes.

The four members of the "Canadian mafia" who staged a scientific coup in machine learning by introducing deep learning

GANs are already used in the largest nuclear research centers to predict particle behavior. There are many other serious applications, but at this early stage two are popular:

  • Image quality enhancement, which is critical where the required quality is hard to obtain at capture time, for example in medicine (Photo-Realistic Single Image Super-Resolution).
  • Text-to-image synthesis.

Deepfake technologies are the main threat arising from GANs, yet in Russia, it seems, nobody notices anything beyond porn. One need only look at a seemingly sensible publication such as Meduza.[4] Indeed, a Yandex search returns pages containing the word "porn" first.

In the United States, threats created with GANs are taken much more seriously. Suffice it to say that The Atlantic, a magazine published for 160 years, which in 1945 carried one of the most important articles in the history of computing, "As We May Think" by Vannevar Bush, responded to these developments with the article The End of Reality. It is to some extent a philosophical piece, in which the author reflects on what it means for society to lose documentary forms of representing reality. Fake but realistic videos broadcast on television could have unpredictable consequences.[5]

The New York Times reported on the real danger of fake news in 2016.[6] Taken in by a false report that Israel was preparing a nuclear attack, the Pakistani Minister of Defense quite seriously tweeted his readiness to launch a nuclear missile strike on an unsuspecting country. The social network was later cleaned up, but, as the saying goes, "the spoons were found, but the bad aftertaste remained."

Within the US Department of Defense, the fight against the malicious use of machine learning technologies has been led by DARPA. In 2016 it launched a dedicated program, MediFor, aimed at countering such threats, but it seems the program was not very successful. This summer DARPA is holding a new contest, or rather a brainstorming session, an AI fakery contest, which leading experts will attend. They will create fake videos and audio recordings, as well as tools for detecting fakes. But, according to one of the participants, time has been lost and it is time to start a new arms race.

"Defending democracy determines the urgency," he said.

Notes

  1. [1] These face-generating systems are getting rather too creepily good for my liking A Style-Based Generator Architecture for Generative Adversarial Networks
  2. [2]
  3. [3]
  4. Deepfakes: porn in which a neural network adds celebrity faces. It is prohibited by all major services, even PornHub
  5. The Era of Fake Video Begins
  6. MediFor