Tags: ai
The comedian George Carlin had a bit known as Things You Never See, where he listed off a series of things you never see or hear (or want to hear) in day-to-day life. Things like a wheelchair with a roll cage, or someone telling their dad that he should drink more. It occurred to me that these would serve as amusing prompts for AI image generation, while giving me an opportunity to play around with the latest trendy AI tools. Continue reading to see the results!
WARNING: The prompts and resulting images are NOT SUITABLE FOR WORK. After all, George Carlin's comedy was so controversial that it sparked a court case about the US government's right to censor.
I first looked into DALL-E and Bing Image Creator, both of which offer web interfaces for inputting prompts and getting back AI-generated images. DALL-E turned out to be paywalled, however, while Bing Image Creator censors any "indecent material" in the prompts. So, defeated by The Man, I turned to Stable Diffusion (SD), an open-source image-generating AI that can be run by anyone with a GPU and patience.
I had to jump between various guides to get SD to work. I'm not going to link a particular guide, because none of them contained all the information I needed. The steps can be summarised as follows:
scripts/txt2img.py from the repository.
Another roadblock I hit was that my GPU (GeForce GTX 1050 Ti) has only 4GB of memory, while 6GB is normally required to run SD. Thankfully, a fork exists to run SD with lower memory requirements. It provides a version of txt2img.py that breaks the GPU computations into multiple stages, allowing you to get away with around 2.5GB of GPU memory. In exchange, you have to wait longer for results: it took my computer over 1 minute to generate each image.
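For illustration, here is a rough Python sketch of how one run of txt2img.py could be assembled as a command line. It only builds and prints the command, nothing is executed. The flag names (--prompt, --H, --W, --n_samples) are assumed from the reference Stable Diffusion script, and the low-memory fork may use a different script path or different options, so treat build_txt2img_command as a hypothetical helper rather than the exact invocation used for this post.

```python
import shlex


def build_txt2img_command(prompt, script="scripts/txt2img.py", size=512, n_samples=1):
    """Return the argument list for one txt2img run (nothing is executed here).

    Assumption: the script accepts --prompt, --H, --W and --n_samples,
    as the reference Stable Diffusion txt2img.py does. A fork may differ.
    """
    return [
        "python", script,
        "--prompt", prompt,
        "--H", str(size),   # image height in pixels
        "--W", str(size),   # image width in pixels
        "--n_samples", str(n_samples),  # images per batch; keep low on a small GPU
    ]


cmd = build_txt2img_command("a wheelchair with a roll bar")
# shlex.join quotes the prompt so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Keeping n_samples at 1 is a guess at what suits a 4GB card, since larger batches need more GPU memory.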
Carlin's jokes and the corresponding images are shown below. The jokes are in italics, while my comments are not. I edited the text of each joke to make it more prompt-like before passing it to SD. For "things you never hear", I generally passed the dialogue by itself, quote marks included. All the images were originally generated at a resolution of 512x512, since the output was terrible when I tried 256x256. Finally, a suggestion: the images may look better if you squint.
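The prompt editing described above was done by hand, but the general idea can be sketched in a few lines of Python. This is a hypothetical approximation, not the actual edits used for the images: joke_to_prompt and its two rules (drop the "You never see" lead-in, or keep only the quoted dialogue with its quote marks) are assumptions made for illustration.

```python
import re

# Assumed lead-in for the "things you never see" jokes.
SEE_PREFIX = re.compile(r"^You never see\s+", re.IGNORECASE)
# First double-quoted span in the joke, quote marks included.
QUOTED = re.compile(r'"[^"]*"')


def joke_to_prompt(joke: str) -> str:
    """Turn a joke into a more prompt-like string (illustrative rules only)."""
    joke = joke.strip()
    if SEE_PREFIX.match(joke):
        # "You never see X." becomes just the description X.
        return SEE_PREFIX.sub("", joke).rstrip(".")
    match = QUOTED.search(joke)
    if match:
        # For "things you never hear", pass the dialogue by itself, quotes and all.
        return match.group(0)
    # Anything else is passed through unchanged.
    return joke


print(joke_to_prompt("You never see a wheelchair with a roll bar."))
print(joke_to_prompt('You never hear someone say "Dad, you really ought to drink more".'))
```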
You never see a Rolls-Royce with a bumper sticker that says "shit happens". It looks like a fancy car, but no bumper sticker.
You never see a really big tall fat Chinese guy with red hair. Those symbols resemble Chinese characters, right? But he doesn't have red hair.
You never see a wheelchair with a roll bar. No roll bar!
You never see someone taking a shit while running at full speed. I used a more explicit prompt to generate the second image, since I thought the first one didn't live up to its full comedic potential. Note how the floor got turned into poop. I kindly censored the naked runner's weird smooth crotch.
You never see a picture of Margaret Thatcher strapping on a dildo. I generated a second one because I thought it was funny. Hopefully it's okay to do this, given that she's dead? He told the joke while she was alive, so...
You never hear someone say "Dad, you really ought to drink more". The first of the lame "you never hear..." prompts, although it's funny to imagine the baby whispering that.
You never hear someone say "Do what you want to the girl, but leave me alone". It seems that SD didn't understand the context of this prompt.
You never hear someone say "As soon as I put this hot poker in my ass I'm going to chop my dick off". The guy looks appropriately serious.
You never hear someone say "Honey, let's sell the children, move to Zanzibar, and begin taking opium rectally". Yet more demon faces.
You never hear someone say "Mom, I've got a big date tonight, can I borrow a French tickler from you?". The disembodied hand is quite funny.
You don't want to come home and hear "Honey, remember how we told the children never to play on the railroad tracks?". AI-generated children are creepy, apparently.
You don't want to be sitting in your doctor's office and hear "Well Jim, there's no reason why you shouldn't live another 20 to 30 years. However, you will be bleeding constantly from both eyes". I could probably pick a better prompt to show someone bleeding from their eyes, but I don't particularly feel like doing that.
You don't want to hear "I'M PREGNANT, YOU'RE THE FATHER, AND I'M GOING TO KILL ALL 3 OF US!" Liking the intensity of this one.
You don't want to hear "Honey, it's the police. They have a search warrant and the 300 kilos of cocaine are still sitting out in the living room." This is quite good, actually.
You don't want to hear your fiancé say "I'll be right back, I've gotta take a dump" while the two of you are having dinner together with your parents for the first time. And I'm not sure why a Willy Wonka meme is supposed to be relevant to this one.
Playing around with SD has made me appreciate the abilities and limitations of current image AIs. As far as I can tell, it takes a lot of thought and a lot of hardware power to get satisfactory results. Regarding the George Carlin prompts, I don't think they were explicit enough to generate the images I was hoping for, particularly the ones in the "things you don't hear" category. SD doesn't seem to be smart enough to pick up on the context of dialogue, and requires a direct description. Also, for some reason, putting the prompt in quotes tends to result in the addition of meme text.
That's all for now! If I were to continue experimenting with SD, then I would try out the many tools and interfaces that people have built around it to help generate better images. There's also the option to base the output on a source image.
I'd be happy to hear from you at galligankevinp@gmail.com.