I wanted to highlight a feature available in Stable Diffusion (and perhaps in other generative AIs): regional prompting. It looks like one can do it manually in Midjourney; I don’t know about others.
It can be difficult, when generating an image, to specify that it contains multiple people with distinct attributes, because the prompt applies to the whole image. Specify a name, and the prompt affects both people. Making one person is easy, and making twins is easy…but making two distinct people isn’t.
Thelsim ran into this in a comment yesterday, where the face of a scared woman was placed on the head of a giant robot.
There are two ways I know of to deal with this. One is inpainting: one generates an image, erases part of it, and then regenerates just that part with a new (potentially different) prompt. This gives a great deal of control, but it is time-consuming, and it makes it hard to go back and change something in the original image.
The second approach is to use regional prompting. This lets the image be divided into regions, with a different prompt applied to each region.
I’m going to walk through the regional prompting process here, in case anyone else hasn’t done it before. For Stable Diffusion, you’ll want to install the Regional Prompter extension, in the Extensions tab.
Okay, let’s take our first stab at the image:
ursula von der leyen and angela merkel sitting side-by-side, by neal adams
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 11, Size: 1024x1024, Model hash: ebf42d1fae, Model: realmixXL_v15, Token merging ratio: 0.5, Version: v1.5.1
Well, that didn’t work very well. On both women, we’ve got Angela Merkel’s face combined with something like a cross of von der Leyen’s and Merkel’s hair.
Activate the Regional Prompter, down at the bottom of the txt2img tab. Check “Use common prompt”, then click “Visualize and make template”. This will show you how the image is currently set up to be divided: by default, two equal-size boxes side by side, separated by a vertical line down the middle of the image. It’ll also give you some text to paste into the prompt box:
ADDCOMM
ADDCOL
The text before “ADDCOMM” is the common prompt, which applies to the whole image. The text between ADDCOMM and ADDCOL applies to the left side of the image. The text after ADDCOL applies to the right side of the image.
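If you end up generating a lot of these prompts, the layout just described is easy to assemble with a few lines of Python. This is only a sketch of the string layout, not anything the extension provides: the helper name build_regional_prompt is made up for illustration, and it assumes the simple case used here (one common prompt followed by side-by-side columns).

```python
def build_regional_prompt(common: str, regions: list[str]) -> str:
    """Join a common prompt and per-column prompts with the ADDCOMM/ADDCOL keywords.

    Hypothetical helper for illustration; it only builds the prompt string.
    """
    if not regions:
        raise ValueError("at least one region prompt is required")
    # Common prompt first, then ADDCOMM, then the columns left to right,
    # separated by ADDCOL.
    return f"{common} ADDCOMM " + " ADDCOL ".join(regions)


prompt = build_regional_prompt(
    "ursula von der leyen and angela merkel sitting side-by-side, by neal adams",
    ["ursula von der leyen", "angela merkel"],
)
print(prompt)
```

The printed string is the one to paste into the prompt box, with the left-hand region listed first.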
ursula von der leyen and angela merkel sitting side-by-side, by neal adams ADDCOMM ursula von der leyen ADDCOL angela merkel
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 11, Size: 1024x1024, Model hash: ebf42d1fae, Model: realmixXL_v15, Token merging ratio: 0.5, RP Active: True, RP Divide mode: Matrix, RP Matrix submode: Horizontal, RP Mask submode: Mask, RP Prompt submode: Prompt, RP Calc Mode: Attention, RP Ratios: “1,1”, RP Base Ratios: 0.2, RP Use Base: False, RP Use Common: True, RP Use Ncommon: False, RP Change AND: False, RP LoRA Neg Te Ratios: 0, RP LoRA Neg U Ratios: 0, RP threshold: 0.4, RP LoRA Stop Step: 0, RP LoRA Hires Stop Step: 0, Version: v1.5.1
Doesn’t look much different; the right side and the left side each favor one woman a bit, but not by much. Let’s have the common prompt describe any two women sitting side-by-side instead. There’s no reason to start only from scenes that already contain von der Leyen and Merkel, and removing both names from the common prompt entirely prevents Merkel from influencing the left side or von der Leyen the right.
two women sitting side-by-side, by neal adams ADDCOMM ursula von der leyen ADDCOL angela merkel
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 11, Size: 1024x1024, Model hash: ebf42d1fae, Model: realmixXL_v15, Token merging ratio: 0.5, RP Active: True, RP Divide mode: Matrix, RP Matrix submode: Horizontal, RP Mask submode: Mask, RP Prompt submode: Prompt, RP Calc Mode: Attention, RP Ratios: “1,1”, RP Base Ratios: 0.2, RP Use Base: False, RP Use Common: True, RP Use Ncommon: False, RP Change AND: False, RP LoRA Neg Te Ratios: 0, RP LoRA Neg U Ratios: 0, RP threshold: 0.4, RP LoRA Stop Step: 0, RP LoRA Hires Stop Step: 0, Version: v1.5.1
Okay, that’s not bad, and now the Neal Adams hand-drawn art style is coming through. But the two are looking suspiciously well-endowed in the breast department, probably because the model I’m using is influenced by the images of other women that it was trained on. Let’s turn down the relative strength of the generic “two women” bit. You can do that by wrapping a term in parentheses and following it with a colon and a strength modifier. An unmodified term has a strength of 1, so I’ll use .1, only a tenth the strength:
(two women sitting side-by-side:.1), by neal adams ADDCOMM ursula von der leyen ADDCOL angela merkel
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 11, Size: 1024x1024, Model hash: ebf42d1fae, Model: realmixXL_v15, Token merging ratio: 0.5, RP Active: True, RP Divide mode: Matrix, RP Matrix submode: Horizontal, RP Mask submode: Mask, RP Prompt submode: Prompt, RP Calc Mode: Attention, RP Ratios: “1,1”, RP Base Ratios: 0.2, RP Use Base: False, RP Use Common: True, RP Use Ncommon: False, RP Change AND: False, RP LoRA Neg Te Ratios: 0, RP LoRA Neg U Ratios: 0, RP threshold: 0.4, RP LoRA Stop Step: 0, RP LoRA Hires Stop Step: 0, Version: v1.5.1
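The same strength syntax can be produced programmatically, which is handy if you want to try several strengths in a row. A minimal sketch, assuming only the (term:strength) form used in the prompt above; the helper name weighted is hypothetical.

```python
def weighted(term: str, weight: float) -> str:
    """Wrap a prompt term in the (term:strength) syntax.

    Hypothetical helper for illustration; an unmodified term has strength 1.
    """
    # ":g" drops trailing zeros, so 0.1 is written as 0.1 rather than 0.100000.
    return f"({term}:{weight:g})"


common = weighted("two women sitting side-by-side", 0.1) + ", by neal adams"
prompt = f"{common} ADDCOMM ursula von der leyen ADDCOL angela merkel"
print(prompt)
```

This writes the strength as 0.1 rather than .1, which is the same number.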
And there we have our final image!
Thank you for these explanations