OpenAI’s AI can generate an image from a sentence

On January 5, 2021, OpenAI presented two innovative artificial intelligence systems: DALL-E, which can create images from simple text, and CLIP, a second model that can learn to recognize categories of objects very quickly.

DALL-E: A model based on GPT-3

Maybe your brain already made the connection, but here is the explanation of the name: DALL-E is a contraction of the name of surrealist artist Salvador Dalí and the robot WALL-E. It is a clever find that fits DALL-E’s purpose perfectly. This artificial intelligence model has a single task: to create an image from a text. That may seem simple on paper, but it remains an extremely complex task for a machine.

DALL-E is based on GPT-3, a language model developed by OpenAI. The company explains: “The GPT-3 model demonstrated that language can be used to direct a neural network to perform various types of text generation. Image GPT showed that the same type of neural network can also be used to generate high fidelity images. We extend these results to show that the manipulation of visual concepts through language is now within reach.”

GPT-3 suffered a spectacular failure last fall. While it was being used as a medical chatbot, the model advised a patient to commit suicide. OpenAI had nevertheless warned Nabla, the company that used GPT-3 in this context, that such use “belongs to the high stakes category, because people rely on accurate medical information to make life and death decisions, and mistakes in this area can lead to serious harm.”

The AI is able to create breathtaking variants

Whatever happens, DALL-E is unlikely to make such mistakes, or at least they won’t have the same consequences. DALL-E is a 12-billion-parameter version of GPT-3, a model originally created to automate text writing. It can generate an image from just a few keywords, and it was trained on hundreds of millions of images and their captions.

For example, DALL-E generated the following image from this text: “Illustration of a white radish baby in a tutu walking a dog”. The example shows that the model can handle a complex task and illustrate a particularly whimsical idea. DALL-E can also edit and rearrange objects in the images it generates.

Image generated by DALL-E

It’s striking how a single idea can lead the model to produce multiple illustrations with small differences. In most of the tests carried out, the results are quite good. However, we don’t yet know exactly what DALL-E could be used for in the real world. For this reason, OpenAI has promised to hold another conference shortly to detail the goals and uses of its latest invention.

