Mr Blobby on Love Island, teapots with six-packs, Guy Fieri drowning in a vat of soup: AI-rendered images from the depths of our (cursed) imaginations have made us belly laugh ever since the tools came into our hands. Many of these images look janky, with blurring and odd textures that let us know there’s no way they could ever be real.
Machine learning, however, is advancing faster than ever, and programs like DALL-E 2, Stable Diffusion, and Midjourney can now churn out photorealistic images in a matter of seconds. A series of striking polaroids featuring Black female goth and heavy metal fans in the late 90s recently went viral on Twitter, showing groups of women in distinctive leather outfits, band tees, and rock hairstyles. It’s hard to believe they’re not actual pictures unearthed from sweaty gigs of the past, but are instead the work of a Midjourney user called Fallon Fox.
If you look closely, it’s quite easy to pinpoint the uncanny valley phenomenon: the hands—a tell-tale sign of machine learning-generated images—often look wrong, sometimes with six-plus fingers gripping drinks weirdly. Some of the faces don’t look quite right either, with out-of-place noses and teeth. But there’s an overall ambience to the shots that feels very real. Nostalgic, even.
Other examples of hyper-realistic, AI-generated photography of the ‘past’ are surfacing as more users play around with the technology. There are pictures from inner-city Glasgow in 1985 with trackie-clad lads standing in front of tower blocks, and 90s rival gangs from LA playing dice games in colour-coordinated fits. Another user produced a depiction of queer culture in the past. None of them look perfect, but when they are viewed as thumbnails or glanced at without a second thought, your eyes could easily gloss over the flaws.
“It is tricky [to differentiate human-shot photography from that created by AI], particularly when those photographs appear in contexts like social media, where our engagement with each image tends to be very brief,” Lewis Bush, a photographer and PhD researcher in Computational Photojournalism at the London School of Economics and Political Science, told SCREENSHOT. “There are still clues in many of these generated images as to their origins, but they tend to be hard to pick up on without a concentrated look at the image.”
It seems like we’re going to be inundated with hyperreal AI-generated photography pretty soon too. Stock image company Shutterstock has just inked a deal to add images created by DALL-E 2. At the same time, in a bid to reassure aggrieved photographers, it’s offering reimbursements to those whose images the AI has been trained on.
If the internet gets saturated with these kinds of photographs, which sit alongside real historical documentation, are we in danger of creating false narratives about the past? It’s worth noting that image manipulation is nothing new: we’ve been able to edit and doctor images for years, dating back to the pre-Photoshop retouching used by the Soviet Union. AI isn’t doing anything that an advanced digital creator couldn’t accomplish.
That being said, these tools are increasingly being used by (almost) the entire world rather than a comparatively tiny group of people. So, should we be worried about machine-generated imagery becoming more advanced? Jon McCormack, a generative art researcher at Monash University, Australia, doesn’t think so. What he is concerned about, however, is “the overall aesthetic and ‘sameness’ of images generated using diffusion models.”
“Because they are trained on images from the internet, they mimic popular styles and aesthetic values, diminishing representations that aren’t as culturally homogeneous or ubiquitous,” McCormack shared. “Such machine learning systems just reinforce this ubiquity and sameness, ultimately at the expense of cultural diversity and creativity.”
TikTok user and digital artist @eli_other, who uses DALL-E 2 to generate images using text prompts like “a LEGO set coming home to his wife cheating on him” and “album cover for an all-horse metal band,” says he’s found that images generated by machines skew towards white male-centric themes. “For many of my prompts, I use neutral language like ‘person’ and about 60 per cent of the time the ‘person’ is a white man. Alongside that, it’s about 20 per cent white women and 20 per cent people of colour of any gender.”
Reactions to the Black goth and heavy metal fans on Twitter ranged from “perfect” and “beautiful” to “icky.” Non-white fans of that type of music did, of course, exist in the 90s, but the goth subculture has historically been associated with whiteness, and there’s a question about whether misrepresentations of the past could lead to skewed narratives about historic oppression. Fallon Fox, the former MMA fighter who’d used Midjourney to produce the polaroids, told Screen Rant that she was trying to “show a representation of people like [herself],” a Black trans woman, in the metal scene. “I put a lot of references to 90s-era Black goths in there,” she added.
With most innovations comes some kind of pushback. Bush believes that, just as with any other picture, it’s “important to understand what it is that you’re looking at, and how that image has been produced.” And there are endless benign and educational purposes for this kind of photo generation. One of Bush’s projects, the zine An Antique Land, uses machine learning to picture London after its fall to the ravages of climate change, with green foliage sprouting out of the Natural History Museum.
In 2021, AI research laboratory OpenAI invented DALL·E, a neural network trained to generate images from text prompts. With just a few descriptive words, the system (named after both surrealist painter Salvador Dalí and the adorable Pixar robot WALL-E) can conjure up absolutely anything from an armchair shaped like an avocado to an illustration of a baby radish in a tutu walking a dog. At the time, however, the images were often grainy, inaccurate and time-consuming to generate, leading the laboratory to upgrade the software and design DALL·E 2. The new and improved model, supposedly.
While DALL·E 2 is slowly being rolled out to the public via a waitlist, AI artist and programmer Boris Dayma has launched a stripped-down version of the neural network which can be used by absolutely anyone with an internet connection. Dubbed DALL·E mini, the AI model is now all the rage on Twitter as users scramble to generate nightmarish creations including MRI images of Darth Vader, a Pikachu that looks like a pug and even the Demogorgon from Stranger Things as a cast member on the hit TV show Friends.
While the viral tool has even spearheaded a meme format of its own, concerns arise when text prompts move beyond innocent Pikachus and Fisher-Price crack pipes to actual human faces, where the risks become insidiously dangerous. As pointed out by Vox, people could leverage this type of AI to make everything from deepnudes to political deepfakes, although the results would be horrific, to say the least. Given that the technology is free to use on the internet, it also harbours the potential to put human illustrators out of work in the long run.
But another pressing issue at hand is that it can also reinforce harmful stereotypes and ultimately accentuate some of our current societal problems. To date, almost all machine learning systems, including DALL·E mini’s distant ancestors, have exhibited bias against women and people of colour. So, does the AI-powered text-to-image generator in question run into the same ethical pitfalls that experts have been warning about for years now?
Using a series of general prompts, SCREENSHOT tested the viral AI generator for its stance on the much-debated racism and sexism that the technology has been linked to. The results were both strange and disappointing, yet unsurprising.
When DALL·E mini was fed the text prompts ‘CEO’ and ‘lawyers’, the results were predominantly white men. A query for ‘doctor’ returned similar results, while the term ‘nurse’ featured mostly white women. The same was the case with ‘flight attendant’ and ‘personal assistant’: both made assumptions about what the perfect candidate for the respective job titles would look like.
Now comes the even more concerning part: when the AI model was prompted with phrases like ‘smart girl’, ‘kind boy’ and ‘good person’, it spun up a grid of nine images all prominently featuring white people. To reiterate: Are we shocked? Not in the least. Disappointed? More than my Asian parents after an entrance exam.
In the case of DALL·E 2, AI researchers have found that the neural network’s depictions of people can be too biased for public consumption. “Early tests by red team members and OpenAI have shown that DALL·E 2 leans toward generating images of white men by default, overly sexualizes images of women, and reinforces racial stereotypes,” WIRED noted. After conversations with roughly half of the red team (a group of external experts who look for ways things can go wrong before the product’s broader distribution), the publication found that a number of them recommended that OpenAI release DALL·E 2 without the ability to generate faces.
“One red team member told WIRED that eight out of eight attempts to generate images with words like ‘a man sitting in a prison cell’ or ‘a photo of an angry man’ returned images of men of colour,” the publication went on to note.
When it comes to DALL·E mini, however, Dayma has already confronted the AI’s relationship with the darkest prejudices of humanity. “While the capabilities of image generation models are impressive, they may also reinforce or exacerbate societal biases,” the tool’s website reads. “While the extent and nature of the biases of the DALL·E mini model have yet to be fully documented, given the fact that the model was trained on unfiltered data from the Internet, it may generate images that contain stereotypes against minority groups. Work to analyze the nature and extent of these limitations is ongoing, and will be documented in more detail in the DALL·E mini model card.”
Although the creator seems to have somewhat addressed the bias, the possibility of options for either controlling harmful prompts or reporting certain results cannot be ruled out. And even if they’re all figured out for DALL·E mini, it’ll only be a matter of time before the neural system is replaced by another with impressive capabilities where such an epidemic of bias could resurface.