We badly described cartoon characters to an AI. Here’s what it drew
OpenAI's Dall-E 2 program can create images just from worded prompts, but how well can it work from my vague descriptions?
The use of artificial intelligence (AI) has made its way into just about every industry in the world, solving complex problems, learning difficult skills in seconds, and generally blowing the minds of humans with advanced calculations we could never dream of.
But while most people are looking to use AI for the good of humanity, bettering society and solving the big questions of today, I decided to use it for a higher purpose - recreating some of the most recognisable cartoon characters through an image-generation AI.
How it works
Back in January 2021, the very intelligent people over at OpenAI created a program that, when fed a string of text, could generate an image from it. This could be 'a lion on a sofa', 'a black hole in a box', or some other equally weird prompt. However, this technology would often produce blurry images, or struggle to fully understand the prompt that it was given. In April 2022, OpenAI launched the second iteration of this product - Dall-E 2.
Now, you are able to get high-quality, highly detailed images based on your worded prompts in a matter of seconds.
The important thing to note with this technology is that the more information you give it, the better the end result will be. However, I lack the creative skills to really paint a verbal picture... and that is pretty noticeable with my AI-generated cartoon characters. The software performs at its best when it is given a style of art and plenty of description. I instead opted for vague art terms and a lack of detail - what could go wrong?
BoJack Horseman
Prompt: Horse in a suit as a oil painting
Let's be honest, based on the description 'Horse in a suit', Dall-E 2 nailed it - that is indeed a horse in a suit, and my request for an oil painting style was matched perfectly. I would even like to think that if there was a season where BoJack Horseman was a mid-century nobleman, my attempt at getting an AI to recreate him would be absolutely bang on.
I think if Dall-E 2 had been given more information on this one, we could have got a perfect recreation of BoJack in an oil painting!
There were two other examples of horses in suits, both also in the style of an oil painting. I think we can all agree that they are just as dapper as the AI's other attempt - who knew horses could style a suit so well?
Teenage Mutant Ninja Turtle
Prompt: Turtle wearing a black mask and holding a sword and pizza as a cartoon
It worked...! Well, it is at least recognisable this time. The prompt of 'a cartoon' did a lot of the heavy lifting here, generating a more child-like version of the original Teenage Mutant Ninja Turtles.
By giving it a bit more information than I did with the horse, Dall-E was able to create a closer attempt.
While this is by no means an exact replica, it's another impressive attempt based on a limited prompt.
Donald Duck
Prompt: Duck with blue shirt and red bowtie as a pencil drawing
What I should have learned from the two above attempts is that the cartoon art style was the way to go here... I did not learn. This is a blessing and a curse because, while this is easily the least accurate attempt, it is also my favourite.
These are both ducks, they both have red bow ties and blue shirts, and they are even both drawings, albeit in different styles, but they could not look any less alike.
While I'm not sure how much extra information I could have fed it (other than a hat), the end image seems like an obvious result - it is exactly what I asked for after all.
Yogi Bear
Prompt: Bear with a green hat and green tie as an animation
Okay, we've got really lost on this one. In an attempt to find a Yogi Bear doppelgänger, I seem to have ended up with a St Patrick's Day mascot.
Once again, all of the points are correct. They are both bears with green ties and hats, but they just look nothing alike.
At least I was more on brand with the art style this time!
Did Dall-E 2 work?
While I think it is safe to say this was a failed mission, I don't think it has anything to do with the ability of Dall-E 2. The software has proved its impressive abilities across a host of prompts, art styles and situations. The lack of similarity comes down to a few key issues.
Firstly, I went into this thinking Dall-E would be on the same wavelength as me. As someone looking to recreate cartoon characters, the prompts are obvious to me - of course 'a duck with a blue shirt and red bowtie' is Donald Duck! But to a program that is very literal in nature, I am asking for something completely different.
Artists with a much better imagination than me have also shown that, with a more detailed prompt and understanding of the platform, you can get some much more impressive results.
Dall-E 2 also has a feature that would have given me a much more accurate result. When you insert an image, it will create hundreds of versions of it in its own style. In retrospect, that would have been a much more logical approach to this task.