AI research laboratory OpenAI is making headlines again for its latest creation, DALL·E 2, an update to the success that was its predecessor, DALL·E. Put simply, DALL·E 2 is a machine learning model that can generate stunning images from text descriptions, and in less time than ever before.
As many of you probably know, AI systems that create images and art from text are nothing new. We've previously seen how the technology could transform your favourite celebrities into Disney characters or reimagine your childhood cartoon characters as humans. But compared to other machine learning models out there, DALL·E 2 can generate more realistic and accurate images with four times greater resolution, thanks to OpenAI's advanced deep learning techniques. Not to mention the results are scarily lifelike.
Announced on 6 April 2022 on the company's Twitter account, the reveal of DALL·E 2 was accompanied by a thread of wonderful images which quickly grabbed the attention of the platform's users.
The thread was followed by a short video showing off what the technology can do. Just letting you know, you're about to see a panda ice skating and a koala dunking a basketball:
As with every milestone OpenAI announcement, DALL·E 2 comes with a detailed paper as well as a dedicated landing page where enthusiasts can read more about how the machine learning model works. A video giving an overview of what the technology is capable of, and where its limitations lie, has also been shared online.
If you're not sure you're up for reading the full paper on DALL·E 2, we've got you covered. To put it as simply as possible, the system is a generative model: a branch of machine learning that creates complex output rather than performing prediction or classification tasks on input data. Basically, you provide DALL·E 2 with a text description and it generates an image that fits that description.
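To make the "text in, images out" idea concrete, here's a minimal sketch of what such a system looks like from a caller's point of view. To be clear, the function and class names below are purely illustrative, they are not OpenAI's actual API (which had not been publicly released at the time of the announcement), and the "images" here are empty placeholders standing in for a real model's output:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GeneratedImage:
    """Placeholder for one generated image and the prompt that produced it."""
    prompt: str
    width: int
    height: int
    pixels: bytes  # a real system would return actual RGB pixel data

def generate_images(prompt: str, n: int = 4, size: int = 1024) -> List[GeneratedImage]:
    """Hypothetical stand-in for a text-to-image model.

    A real generative model would map the prompt to novel images;
    here we just return n blank placeholder images of the requested size.
    """
    return [
        GeneratedImage(prompt=prompt, width=size, height=size,
                       pixels=bytes(size * size * 3))  # 3 bytes per RGB pixel
        for _ in range(n)
    ]

# A caller simply describes what they want and gets candidate images back:
images = generate_images("Teddy bears shopping for groceries in the style of ukiyo-e", n=2)
print(len(images), images[0].width)  # 2 1024
```

The key point the sketch captures is that the user never touches the model internals: the whole interface is a prompt string going in and a batch of candidate images coming out.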
Generative models are a hot area of research that gained serious attention with the introduction of generative adversarial networks (GANs) in 2014, an approach that pits two neural networks against each other, with one generating fake samples and the other trying to tell them apart from real ones. You probably remember websites such as ThisPersonDoesNotExist or ThisStartUpDoesNotExist, right? Well, these were early examples of what GANs could help us do.
Since then, however, the field has seen tremendous improvements, and generative models have been used for a wide variety of tasks including creating artificial faces, deepfakes, synthesised voices, and more. What sets DALL·E 2 apart from other generative models is its ability to maintain semantic consistency in the images it creates. This consistency is clearly shown in the examples OpenAI has shared.
Let's say you enter the description "Teddy bears shopping for groceries in the style of ukiyo-e" into the AI system. DALL·E 2 will offer you images like these (among many others).
Now, imagine that you changed the description you’ve given it slightly to “Teddy bears shopping for groceries in ancient Egypt,” because why not? In only a few seconds, DALL·E 2 would give you some of these instead:
But what if you wanted those cute bears to be shown shopping again, only using a one-line drawing style instead?
The point is, the model remains consistent in the examples it offers you depending on the text prompt you give it. This same consistency shows itself in most examples OpenAI has shared so far. Furthermore, DALL·E 2 seems to understand depth and dimensionality, a great challenge for algorithms that process 2D images.
Even if the examples currently shown on OpenAI’s website were cherry-picked, they are still impressive. In fact, to prove how good the technology is, OpenAI’s CEO Sam Altman took to Twitter and asked users to suggest prompts to feed to the generative model. The results (see the thread below) are fascinating.
All that said, it remains to be seen how deep DALL·E 2's commonsense and semantic stability go, and how its successors will deal with more complex concepts such as compositionality. In the meantime, OpenAI's website invites enthusiasts to join its waitlist for DALL·E access, and in a blog post, Altman suggested a possible product launch as soon as this summer. So start prepping your written descriptions for an AI girl summer.
Hidreley Diao, a hugely popular digital artist wielding the power of AI for his creations, previously took the internet by storm with his real-life depictions of cartoon characters. But that is just the latest series in his collection of works. In an older series which deserves just as much appreciation, Diao used both AI technology and digital software to create modern-day impressions of deceased celebrities. Basically, if they had lived, what would they look like now? The results are mind-blowing!