
New AI DALL-E 2 creates images from text prompts

A new AI system can create a variety of images, both realistic and surreal, from natural language descriptions. It has the potential to be game-changing, but not without some concerns.

Algorithms and AI continue to make waves across the art industry.

The latest breakthrough comes from a research company called OpenAI, which has just unveiled a new version of its DALL-E program. This software is a text-to-image generation tool that produces artwork based on a description typed in by the user.

 


Aptly called DALL-E 2, this new AI isn’t available to the public yet, but researchers can sign up to preview its capabilities. OpenAI says it plans to make its software available for use in third-party apps eventually, though there is no word on when this will happen.

For now, the program will be tested by vetted partners.

Users aren’t allowed to upload generated images that may ‘cause harm’, and must disclose what they’re using the AI for.


How does DALL-E 2 work?

While I am by no means a coding expert, the broad idea is that DALL-E 2 is trained on a vast collection of images paired with text descriptions. The algorithm learns from this wealth of captioned pictures and then creates new artwork based on what it has learned.

Say you wanted to create an image of a tiger on a canoe. Weird, right? DALL-E 2 doesn’t look anything up in a file store at this point. Instead, it draws on what its training taught it about what a ‘canoe’ and a ‘tiger’ both look like, and creates a single piece that convincingly combines both.

 


DALL-E 2 builds on CLIP, a computer vision system that OpenAI introduced alongside the first DALL-E. OpenAI says that this new software generates images using ‘diffusion’, whereby a piece begins as a pattern of random dots and is gradually altered until it becomes a detailed image.
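
For the curious, the ‘random dots that gradually become an image’ idea can be sketched in a few lines of Python. This is only a toy under loud assumptions: the eight-number `target` stands in for an image, and the function cheats by already knowing the finished picture, whereas a real diffusion model uses a trained neural network to work out what to change at each step. None of the names below come from OpenAI.

```python
import random

def toy_denoise(target, steps=50, seed=0):
    """Start from random noise and nudge it toward `target` a little each step.

    ASSUMPTION: a real diffusion model predicts each update with a trained
    network; this toy cheats by using the known target directly.
    """
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]  # the "random dots"
    history = [list(image)]
    for step in range(steps):
        remaining = steps - step
        # Close 1/remaining of the gap, so the last step lands on the target.
        image = [x + (t - x) / remaining for x, t in zip(image, target)]
        history.append(list(image))
    return history

def error(image, target):
    """Mean squared difference between the current image and the target."""
    return sum((x - t) ** 2 for x, t in zip(image, target)) / len(target)

# Hypothetical 8-"pixel" picture, purely for illustration.
target = [0.0, 0.5, 1.0, 0.5, 0.0, 0.5, 1.0, 0.5]
history = toy_denoise(target)
errors = [error(image, target) for image in history]
```

Reading through `errors` shows the picture starting far from the target and getting closer at every step, which is the intuition behind the process even though the real thing is vastly more sophisticated.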

This process happens via a two-stage model. First, CLIP’s text encoder turns your prompt into a numerical representation, which a ‘prior’ model converts into a matching image representation. Then a ‘decoder’ generates the picture itself.
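
To make the two stages a little more concrete, here is a rough structural sketch in Python. Every function in it is a hypothetical stand-in invented for illustration: the three-number ‘concept’ vectors, the word list, and the trivial `prior` and `decoder` are assumptions, not OpenAI’s code, and the real components are large neural networks. It only shows how a prompt flows from text, to a representation, to an output.

```python
import math

# ASSUMPTION: a made-up 3-dimensional concept space (tiger, canoe, river).
CONCEPTS = ["tiger", "canoe", "river"]
VOCAB = {
    "tiger": (1.0, 0.0, 0.0),
    "canoe": (0.0, 1.0, 0.0),
    "river": (0.0, 0.0, 1.0),
}

def encode_text(prompt):
    """Stand-in for a text encoder: add up known word vectors, then normalise."""
    vec = [0.0, 0.0, 0.0]
    for word in prompt.lower().split():
        for i, value in enumerate(VOCAB.get(word, (0.0, 0.0, 0.0))):
            vec[i] += value
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]

def prior(text_embedding):
    """Stage one stand-in: text representation -> image representation.

    Here it is just a copy; the real model learns this mapping.
    """
    return list(text_embedding)

def decoder(image_embedding):
    """Stage two stand-in: 'render' by listing the concepts that are present."""
    return [name for name, v in zip(CONCEPTS, image_embedding) if v > 0.1]

def generate(prompt):
    """Chain the stages: encode the text, apply the prior, then decode."""
    return decoder(prior(encode_text(prompt)))

result = generate("a tiger on a canoe")
```

In this toy, `result` contains both ‘tiger’ and ‘canoe’, mirroring how the real system combines the ideas in a prompt rather than retrieving an existing photo.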

OpenAI’s demonstration video gives a brief look at what’s possible, showing off AI-generated cats, realistically edited versions of pre-existing images, and a complex system of object labelling that allows DALL-E 2 to understand your prompts. It’s truly impressive stuff.

Interestingly, OpenAI stresses that there are still errors and issues to iron out.

 


Objects that are mislabelled in the training data could cause the algorithm to produce incorrect pictures that do not line up with the text description provided. If it has learned from photos of cars that were labelled as ‘plane’, for example, then this could lead the generator completely off course, sending back a BMW instead of a Boeing.
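
The effect of bad labels can be shown with a deliberately tiny Python example. It assumes a made-up ‘model’ that simply averages two hand-picked features (has wings, drives on roads) for each label, which is nothing like how DALL-E 2 actually learns, but it shows how wrongly tagged cars drag the model’s idea of a ‘plane’ towards a car.

```python
from collections import defaultdict

def learn_prototypes(examples):
    """Average the two-number feature vectors seen under each label."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for label, features in examples:
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

# ASSUMPTION: hypothetical features (has_wings, drives_on_roads).
CAR = (0.0, 1.0)
PLANE = (1.0, 0.0)

# Correctly labelled data versus data where three cars are tagged 'plane'.
clean = [("plane", PLANE)] * 4 + [("car", CAR)] * 4
noisy = [("plane", PLANE)] * 1 + [("plane", CAR)] * 3 + [("car", CAR)] * 4

clean_model = learn_prototypes(clean)
noisy_model = learn_prototypes(noisy)
```

With clean labels the learned ‘plane’ is all wings and no roads. With the noisy labels, `noisy_model["plane"]` ends up looking mostly like a car, which is the BMW-instead-of-Boeing problem in miniature.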

In addition, very specific prompts won’t work well until the AI has seen enough labelled examples to learn what the relevant objects are.

Asking for a particular town or a rare species of animal may result in wonky, incorrect images until the algorithm has been improved. Keep in mind this is only the second iteration of DALL-E, so we’ll no doubt see even more mind-bending demos in the future.


Why could this cause issues for artists?

After perusing the artwork created by DALL-E 2, it’s hard not to feel excited by the possibilities of the technology.

We should be mindful of potential pitfalls, however. Artists already have a very difficult time earning money for their work in the internet age, which was part of the original rationale for NFTs, and a new algorithm-based image tool could put many small-time digital illustrators out of business.

 


It’ll also become much harder to verify the authenticity of an image or painting online, which may devalue the work of genuine human beings. Instant image minting could become a possibility, creating an even more exploitative NFT market.

If everyone can make anything instantly, do illustrations and paintings lose all their commercial value? Does art itself become simply another application or tool for anyone to use?

 


There are big, existential questions about the implications of such genuinely ground-breaking software, many of which we don’t have the answers to.

To OpenAI’s credit, it seems very aware of the dangers. It says that DALL-E 2 is not being made fully available to the public for now, and will only be slowly rolled out to trusted researchers and partners based on feedback. Users will need to say why they’re using the software and cannot make any images that are obscene or harmful.

It also wants to ensure that misinformation and deep-faked images do not end up causing further havoc to our political systems and online discourse.

 


These intentions may be sound enough, but who’s to say that other, less well-meaning coders won’t simply copy OpenAI’s work? We’ve already seen one application called Wombo’s Dream launch last year, clearly based on this concept.

You can access it right now – though it is far less sophisticated than DALL-E 2.

Ultimately, we’ve no idea how this technology could impact the art world. What we do know is that things are getting scarily impressive, perhaps even a little uncanny valley. For now, OpenAI seems to be rolling out its products responsibly – and that’s the best we can hope for at this early stage.
