Insights From The Blog

Midjourney Can Now Create Consistent Characters

One of the major complaints about Midjourney is its inability to reproduce the same character across different environments and locations. Up until now, its output has been a little random and inconsistent, but that is likely to change with an update to the base program.

Random Outputs

By their very nature, AI image generators start from random noise and use the user's keywords to steer that noise towards an image the program believes the user wants. These programs are built on diffusion models, which sift and sort huge amounts of data to create the image. Diffusion models are essentially generative systems that acquire knowledge from their training data and then produce comparable new instances based on what they have learned.

These models are shaped by the data they were trained on and have attained exceptional quality across several types of image generation. However, because each image is drawn afresh from that learned knowledge, the results are nearly always one-offs based on the model's interpretation of the prompt. Inevitably, this means there is little consistency in the output, and every Midjourney creation is usually different from the last, even when based on the same keywords.

In most cases, this isn't a problem, as many users just want one-off images based on their keywords. But what if you are storyboarding a film or producing a graphic novel and want the same character in a range of different scenarios? Up until now, that meant hiring a graphic artist to do the work by hand, but an update to Midjourney now comes much closer to doing the job itself.

Gaining Consistency

The latest iteration of Midjourney gets close to that goal with the introduction of a new character reference tag, "--cref". Adding this tag to the end of a prompt forces Midjourney to match a character's physical traits across multiple images, so in most cases you get the same character with a range of expressions and poses. The poses and background can then be altered for each image by introducing specific prompt tags.
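In practice, the tag simply sits at the end of an otherwise normal prompt. In general terms (the scene description is whatever you would normally type, and [URL] is a link to a reference image), it looks like this:

  [your scene description] --cref [URL of a reference image]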

Make no mistake: this is a big deal for content producers looking for consistency in AI output. Up until now, creating the same character in a number of different situations or against different backgrounds was pretty much impossible with AI, because there was no real way to force the app to retain certain features (i.e. the character) while changing the background. With this new capability, Midjourney can produce the same character in different environments, which will drastically cut down on production time.

The Way Forward

The Midjourney team were excited when they developed this new feature, but needed to check how consistent it was, running a swathe of tests to find the limits of the application. Having now put in hundreds of hours of checking, the team have found that the --cref tag works best when you start from a character that the program has previously generated, rather than trying to conjure a brand-new character across many related images at once. To get the best results, the process looks like this:

  • Generate a new character with the attributes that you want.
  • Select the image that best fits your needs and control-click it in the Midjourney Discord server to find the “copy link” option.
  • Type a new prompt, such as "standing on the Death Star" or "breaking out of prison", and append --cref followed by the copied URL; Midjourney will try to recreate the same character in the newly described context, as in the example below.
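To make that concrete, a full prompt entered into Discord might look something like the line below. The scene wording is purely illustrative, and [URL] stands for the link you copied in the second step:

  /imagine prompt: breaking out of a prison cell at night --cref [URL]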

Behold, the same character appears in the scenes that you have specified. The --cref feature is a huge leap forward, and it is also adjustable in its intensity (or character 'weight'), which controls how closely the resulting image matches the original character. This is done by adding the extra parameter --cw followed by a number between 0 and 100. The full command would read something like --cref [URL] --cw 100, with the number representing how closely the image should match the original. A weight of 100 (the default) makes the new image match the character as closely as possible, right down to hair and clothing, while a very low weight keeps little more than the face and allows far greater variation.
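For example (again, the scene text is illustrative and [URL] is your own reference link), you could run the same prompt twice to compare weights:

  /imagine prompt: standing on a windswept cliff --cref [URL] --cw 100
  /imagine prompt: standing on a windswept cliff --cref [URL] --cw 10

In broad terms, the first should keep the character's hair and clothing intact, while the second keeps the face but lets the rest vary.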

The feature is still only in its alpha state but is already showing excellent results, and as the Midjourney team continues to develop the app, the results should only improve. It all has a bright future, unless you happen to be a graphic designer, of course.