DALL-E 3 renders legible text inside images and I have not found another model that does it as reliably
This is a specific capability post, because I think this is genuinely underreported.
I do cover design work for indie authors and small publishers. A lot of cover concepts involve text as a design element: a stylized title treatment, a word integrated into an illustration, a sign in a scene that needs to be readable. Most AI image generators handle text integration badly. Letters get scrambled, words blend into the background, and anything beyond two or three characters usually distorts.
DALL-E 3 is the model I use specifically for anything that requires legible text in the image. It is meaningfully better than the alternatives at rendering accurate text as part of a composition rather than as a separate overlay. For a book title integrated into a scene, a sign in a storefront illustration, a label on an object, the accuracy is high enough to be usable as a starting point rather than a frustrating failure.
Conversational prompting through ChatGPT is the workflow I use. I describe the composition, the style, and the text elements I need and where they should sit, then iterate through the chat interface. "Make the title text larger and move it to the upper third." "Change the background to late evening light." "The font should look hand-lettered." Each revision builds on the last without starting over.
Aspect ratio control matters for cover work specifically, where vertical format is the default. Asking for widescreen, vertical, or square output in the prompt adjusts the canvas automatically.
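In ChatGPT those three formats are just words in the prompt, but if you ever script generation through the OpenAI API they correspond to fixed size strings. A minimal sketch, assuming the standard OpenAI Python SDK; the helper name and mapping are my own illustration, and the size strings are the ones DALL-E 3 accepts:

```python
# DALL-E 3 accepts exactly three output sizes; map the informal
# orientation names from the prompt workflow onto them.
# (Helper and dict names are mine, not part of any SDK.)
DALLE3_SIZES = {
    "vertical": "1024x1792",    # book-cover / portrait orientation
    "widescreen": "1792x1024",  # landscape
    "square": "1024x1024",
}

def dalle3_size(orientation: str) -> str:
    """Return the DALL-E 3 size string for a named orientation."""
    try:
        return DALLE3_SIZES[orientation.lower()]
    except KeyError:
        raise ValueError(f"unknown orientation: {orientation!r}")

# A call through the SDK would then look roughly like (untested sketch):
#   client.images.generate(model="dall-e-3",
#                          prompt="...",
#                          size=dalle3_size("vertical"))
```

For cover mockups I only ever need the vertical size, but keeping the mapping explicit avoids mistyping a size string the API would reject.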
The Custom Instructions feature lets me set a persistent style baseline, so I am not re-specifying lighting preferences, artistic style, and camera type with every new prompt.