ANALYSIS OF IMAGE GENERATION MODELS WITH TEXTUAL ELEMENTS

Authors

DOI:

https://doi.org/10.31891/2219-9365-2025-81-18

Keywords:

image generation models, diffusion models, DALL-E, Flux, RecraftV3, TextDiffuser-2

Abstract

This article addresses the problem of generating images with integrated textual content, a highly relevant task for modern artificial intelligence technologies. Despite significant advances in image generation with diffusion models, accurately rendering text remains difficult because the model must preserve the correct character sequence and place the textual elements correctly. The study aims to evaluate how well four modern models (DALL-E-3, Flux-1-Pro, RecraftV3, and TextDiffuser-2) render text of varying input lengths, and to identify the critical points at which the quality of textual elements in generated images deteriorates significantly.

For the experimental part, a set of text prompts ranging from 1 to 15 words was created, including simple words, short phrases, and more complex sentences. Each prompt was processed ten times by each model, providing a representative sample of results. Analysis of the generated images made it possible to identify critical points (the text lengths at which the models fail to produce correct text) and to classify the typical errors in the generated images.
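The search for a critical point described above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes that the text rendered in each image has already been transcribed (manually or with OCR), that correctness means an exact match after case and whitespace normalization, and that a critical point is the shortest prompt length whose match rate falls below a chosen threshold. The function names and the default threshold of 0.5 are hypothetical.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_rate(records):
    """Return {word_count: share of generations whose text matches the prompt}.

    `records` is an iterable of (word_count, expected_text, rendered_text)
    tuples, one per generated image (e.g. ten per prompt per model).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for word_count, expected, rendered in records:
        totals[word_count] += 1
        if normalize(expected) == normalize(rendered):
            hits[word_count] += 1
    return {n: hits[n] / totals[n] for n in sorted(totals)}


def critical_length(rates, threshold=0.5):
    """Return the smallest word count whose rate is below `threshold`.

    The threshold is an assumed parameter, not a value from the study.
    Returns None if no length falls below it.
    """
    for n in sorted(rates):
        if rates[n] < threshold:
            return n
    return None
```

Running `exact_match_rate` separately on each model's transcriptions and passing the result to `critical_length` would give one critical length per model, which could then be compared across models.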

The results indicate significant differences between the models: RecraftV3 demonstrated the highest stability, maintaining text accuracy up to 14 words, while the quality of DALL-E-3 and Flux-1-Pro degraded beyond 5 words. TextDiffuser-2 exhibited a high error rate, which limits its use in tasks where accuracy is critical. The findings have practical value for further improving image generation algorithms, particularly for applications in advertising, design, and automated visual content creation.

Published

2025-03-10

How to Cite

SHAPTALA R., & YAKOVENKO Ya. (2025). ANALYSIS OF IMAGE GENERATION MODELS WITH TEXTUAL ELEMENTS. MEASURING AND COMPUTING DEVICES IN TECHNOLOGICAL PROCESSES, (1), 152–159. https://doi.org/10.31891/2219-9365-2025-81-18