
This is the first time I am actually doing a rapid ideation sprint myself (I’ve coached students through them before, but never done one).
I’m excited to get started, and hope to gain insights I can share with students in the future.

The image below shows the flow of the rapid ideation sessions:

The theme

The course I am following has announced the theme we will be working with, a Dixie card:

The plan

As someone who has been interested in AI image creation tools for some time, I have noticed the rapid development in this field, as evidenced by the software I use and the content I consume on social media. However, while I have been able to create beautiful things with these tools that I wouldn’t have been able to otherwise, I often feel like I lack control over the output.

For this sprint, my goal is to explore how much control I can exert over an AI image generation tool using the Dixie Card as input. I have familiarized myself with the various tools available and conducted some tests, but so far, the output still feels quite random to me. I hope to find a way to exert more control over the results.
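
To make the “control” problem concrete: in most Stable Diffusion front-ends the randomness comes from the sampling seed, so pinning the seed (together with the prompt and settings) makes a generation repeatable. Below is a minimal sketch of that idea using the open-source diffusers library; the model ID, prompt and parameter values are placeholders rather than anything I used in the sprint.

```python
# Minimal sketch: fixing the seed makes a Stable Diffusion generation repeatable.
# Model ID, prompt and values are placeholders, not the setup used in this sprint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a surreal, painterly illustration in the style of a story-telling card"
generator = torch.Generator(device="cuda").manual_seed(42)  # same seed -> same image

image = pipe(
    prompt,
    generator=generator,
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("seeded_result.png")
```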

Research

I have spent a considerable amount of time researching and experimenting with various techniques, software packages, and tools in order to compile a list of those that I want to focus on during this sprint. I have done this by using the longlist/shortlist technique and by seeking advice from artists and professionals in various Discord communities and Facebook groups.

Longlist

Craiyon

Positives
Easy to use/setup (web interface)

Negatives
Limited options, prompts did not meet expectations at all

NightCafe

Positives
Easy to use & active community

Negatives
1/4 of output is given
Expensive
Prompts don’t give the feeling of “control”

Midjourney

Positives
Easy to use & very active community
Discord bot integration

Negatives
Can be slow if used by many
Expensive
Prompts don’t give the feeling of “control”

DALL-E 2

Positives
Easy to use & very active community
Web interface
Fast
Masking & regeneration options
File history

Negatives
Expensive

StableDiffusion

Positives
Easy to use & active community

Negatives
1/4 of output is given
Expensive
Prompts don’t give the feeling of “control”

AISEO

Positives
Easy to use & active community

Negatives
Prompts don’t give the feeling of “control”
No masking options, not many variations available upon generation

Photoshop integration (StableDiffusion)

Positives
Free to use, if running locally (hard to set up though)
An extra layer of control is possible by masking & regenerating parts of an image (see the inpainting sketch after this entry)

Negatives
Very expensive to use if not running locally, and running StableDiffusion locally is quite a task
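
For context on that “masking & regenerating” workflow: the underlying technique is usually called inpainting, where a mask marks the region to regenerate while the rest of the image is preserved. The sketch below shows the general idea with the open-source diffusers library; the model ID, file names and prompt are illustrative assumptions, not the Photoshop plugin’s actual interface.

```python
# Rough sketch of inpainting: regenerate only the masked region of an image.
# Model ID, file names and prompt are illustrative; the Photoshop plugin wraps
# the same idea behind its own UI.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("generated_scene.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("RGB").resize((512, 512))  # white = regenerate

result = pipe(
    prompt="a stormy sunset sky",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("generated_scene_inpainted.png")
```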

Blender integration (Midjourney or StableDiffusion)

Positives
Nice way to “block” AI art with 3D shapes
Free to use (if running StableDiffusion locally, hard though)


Negatives
Expensive to use if not running locally
Needs knowledge of Blender
Other than quickly blocking out input scenes in Blender, there is not much control afterwards beyond Blender’s built-in compositing options (a sketch of the render step follows this entry)
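
To make the “blocking” idea concrete: the Blender side of these integrations mostly amounts to rendering a rough 3D scene to a square still image, which the AI step then reinterprets. Below is a minimal sketch of that render step with Blender’s Python API; the resolution and output path are my own assumptions rather than settings prescribed by any particular add-on.

```python
# Minimal sketch: render the blocked-out 3D scene to a square still image that
# can then be fed to Stable Diffusion. Resolution and output path are my own
# assumptions, not settings prescribed by any particular add-on.
import bpy

scene = bpy.context.scene
scene.render.resolution_x = 512          # square output suits Stable Diffusion
scene.render.resolution_y = 512
scene.render.image_settings.file_format = "PNG"
scene.render.filepath = "//blocking_render.png"  # path relative to the .blend file

bpy.ops.render.render(write_still=True)
```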

Shortlist & conclusion

After conducting small tests with the various tools, I have concluded that while the Blender integration with StableDiffusion is fun, it can be quite expensive to use. StableDiffusion itself, while difficult to set up, is a good open-source route to free results. I have decided to use DALL-E 2 in a future sprint and, for this sprint, to work with the Blender integration of StableDiffusion.

In the future, I may dive deeper into the code and attempt to set up StableDiffusion locally. I have also decided to keep exploring DALL-E 2 and Midjourney, given their options for controlling output and the potential for further experimentation and learning. Right now, I am excited to jump into Blender and see what StableDiffusion can do for me.

Prototype (using Blender integration)


For this sprint’s prototype, I wanted to experiment with the Blender integration of StableDiffusion to create AI-generated results with some level of control. Although the way I am currently using StableDiffusion is expensive, I anticipate using integrations like this more often once they become more affordable. To gain some experience with it, I used the “AI Render” add-on for Blender, which integrates Stable Diffusion into the render workflow.
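
As I understand it, AI Render sends a Blender render plus a prompt to a hosted Stable Diffusion service and applies an image-to-image pass, where a strength setting decides how far the result may drift from the input. The sketch below reproduces that idea with the open-source diffusers library instead of the add-on itself; file names, prompt and values are placeholders.

```python
# Sketch of an image-to-image pass: the Blender render of the blocked scene is
# the starting point, and `strength` controls how far Stable Diffusion may
# drift from it. File names, prompt and values are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

blocking = Image.open("blocking_render.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a dreamlike landscape inspired by a story-telling card",
    image=blocking,
    strength=0.6,        # lower values stay closer to the Blender blocking
    guidance_scale=7.5,
).images[0]
result.save("img2img_output.png")
```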

I’ve achieved some interesting results and documented my design process visually in the attached link.

 

The image above is the final result of a multi-step process using the Blender integration via the AI Render plugin. While the calculations for this process can be costly (the hosted Stable Diffusion service behind the plugin charges credits for each action), I felt it was worth it to explore the toolset.

To provide some context, I’ll walk through the 14 steps I took to arrive at this end result (a sketch of the chained image-to-image passes follows the list).

1. Starting image.
2. Starting image cropped to a square, since Stable Diffusion works with square images.
3. Blender scene input (3D blocking)
4. Blender scene output (input = 3)
5. Blender scene input (3D blocking)
6. Blender scene output (input = 8)
7. Blender scene output (input = 8)
8. Blender scene input (3D blocking)
9. Blender scene output (input = 5)
10. Blender scene input (3D blocking)
11. Blender scene output (input = 8)
12. Blender scene output (input = 8)
13. Blender scene output (input = 10)
14. Blender scene output (input = 10)
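
The numbering above hides a simple loop: every “output” image is generated from an earlier image, either a fresh Blender blocking or a previous output, so the whole process is really a chain of image-to-image passes. Below is a hedged sketch of that chaining with the diffusers library; the file names, prompts and strength values are illustrative, not the exact ones I used.

```python
# Sketch of the chained workflow: each pass feeds an earlier image back in as
# the new starting point. File names, prompts and strengths are illustrative.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# (source image, prompt, strength) per pass; later passes stay closer to input.
passes = [
    ("blocking_render.png", "dreamlike landscape, first pass", 0.70),
    ("pass_01.png",         "dreamlike landscape, refined",    0.50),
    ("pass_02.png",         "dreamlike landscape, final",      0.35),
]

for i, (source, prompt, strength) in enumerate(passes, start=1):
    init = Image.open(source).convert("RGB").resize((512, 512))
    out = pipe(prompt=prompt, image=init, strength=strength).images[0]
    out.save(f"pass_{i:02d}.png")   # the next pass picks this file up as input
```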

To summarize: when blocking in 3D, it is important to carefully consider the alignment of shapes and colors, as this significantly affects the generated image. The comparison of images 3 and 4 demonstrates this: a strong contrast between the subject and its environment produced a more pronounced change in the output, whereas keeping shapes and colors consistent produced more predictable results.