A new AI system can create a variety of images, both realistic and surreal, from natural language descriptions. It has the potential to be game-changing, but not without some concerns.
Algorithms and AI continue to make waves across the art industry.
The latest breakthrough comes from the research lab OpenAI, which has just unveiled a new version of its DALL-E program. The software is a text-to-image generation tool that produces artwork from a written description supplied by the user.
Aptly called DALL-E 2, this new AI won’t be available to the public, but researchers can sign up to preview its capabilities. OpenAI says it plans to make its software available for use in third-party apps eventually – though no word on when this will happen.
For now, the program will be tested by vetted partners.
Users aren’t allowed to upload generated images that may ‘cause harm’, and must disclose what they’re using the AI for.
How does DALL-E 2 work?
While I am by no means a machine-learning expert, the basic idea is that DALL-E 2 is trained on a huge library of existing images. The algorithm is shown a wealth of captioned pictures and then creates new artwork based on the patterns it has learned.
Say you wanted to create an image of a tiger on a canoe. Weird, right? But DALL-E 2 draws on what it has learned about what a ‘canoe’ and a ‘tiger’ each look like, and creates a single piece that convincingly combines the two.
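DALL-E 2 itself can’t be called from code yet, but a comparable open-source diffusion model gives a feel for the workflow. The sketch below uses Hugging Face’s diffusers library with a Stable Diffusion checkpoint, which is an assumption chosen purely for illustration, not the system OpenAI has built.

```python
# Sketch: text-to-image with an open diffusion model (not DALL-E 2 itself).
# Assumes the `diffusers` and `torch` packages are installed and a GPU is available.
import torch
from diffusers import StableDiffusionPipeline

# Load a publicly available text-to-image diffusion pipeline.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# The prompt plays the same role as a DALL-E 2 description.
prompt = "a tiger paddling a canoe down a river, photorealistic"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]

image.save("tiger_canoe.png")
```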
DALL-E 2 builds on CLIP, a computer vision system OpenAI developed alongside the first DALL-E. OpenAI says the new software generates images using ‘diffusion’, a process in which a picture starts as a pattern of random dots and is gradually refined until it matches the description.
This happens via a two-stage model: a ‘prior’ first converts your text into a CLIP image embedding, essentially a numerical summary of what the picture should contain, and a diffusion ‘decoder’ then generates the picture itself from that embedding.
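To make the two stages concrete, here is a minimal sketch. The text embedding uses the openly released CLIP model via Hugging Face’s transformers library; the prior and decoder are hypothetical toy stand-ins that only illustrate the shape of the pipeline, not OpenAI’s actual, unreleased components.

```python
# Sketch of the two-stage idea: CLIP text embedding -> prior -> diffusion decoder.
# The CLIP call below is real (Hugging Face `transformers`); `toy_prior` and
# `toy_decoder` are hypothetical placeholders for OpenAI's unreleased models.
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Stage 0: embed the text prompt with CLIP.
inputs = processor(text=["a tiger on a canoe"], return_tensors="pt", padding=True)
text_embedding = clip.get_text_features(**inputs)  # shape: (1, 512)

def toy_prior(text_emb: torch.Tensor) -> torch.Tensor:
    """Hypothetical 'prior': map a text embedding to an image embedding."""
    return text_emb + 0.01 * torch.randn_like(text_emb)

def toy_decoder(image_emb: torch.Tensor, steps: int = 50) -> torch.Tensor:
    """Hypothetical diffusion 'decoder': start from random noise and refine it
    step by step, conditioned on the image embedding."""
    image = torch.randn(1, 3, 64, 64)  # pure noise to begin with
    for _ in range(steps):
        # A real decoder would predict and remove noise with a trained network;
        # here we only nudge the pixels to illustrate the iterative refinement.
        image = image - 0.01 * image
    return image

image_embedding = toy_prior(text_embedding)
generated = toy_decoder(image_embedding)
print(generated.shape)  # torch.Size([1, 3, 64, 64])
```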
OpenAI’s demo video gives a brief demonstration of what’s possible, showing off AI-generated cats, realistically edited versions of pre-existing images, and a system of object labelling that allows DALL-E 2 to understand your prompts. It’s truly impressive stuff.
Interestingly, OpenAI stresses that there are still errors and issues to iron out.
Mislabelled objects in the training data can cause the algorithm to produce pictures that don’t line up with the text description provided. If, for example, its training images include a photo of a car labelled as a ‘plane’, the generator could be led completely off course, sending back a BMW instead of a Boeing.
In addition, very specific prompts won’t work well until the AI has seen and learned what the relevant objects look like.
Asking for a town or rare species of animal may result in wonky, incorrect images until the algorithm has been improved. Keep in mind this is only the second iteration of DALL-E, so we’ll no doubt see even more mind-bending demos in the future.