Friday, May 8, 2026

"Looks Like a Real Couple at the Ballpark" ... AI Images Impress with Their Realism

Published: 2026-05-05 18:34:17
Updated: 2026-05-05 18:34:17
Images generated when the reporter entered the same prompt into ChatGPT Images 2.0 (top) and Nano Banana 2 (bottom): "a realistic scene in which a young couple is spotted in the stands during a baseball broadcast." Photo by Ju Won-gyu.
Competition among global tech giants in generative artificial intelligence is now extending to image generation as well. As models continue to improve, small differences in realism and usability are emerging as key factors in market evaluation.
On the 5th, the reporter compared OpenAI's ChatGPT Images 2.0 with Google Gemini's image generation and editing model, Nano Banana 2. Nano Banana had once been seen as far ahead of ChatGPT's image model, but OpenAI's newly refined ChatGPT Images 2.0 produced more realistic results when given the same prompt.
■ ChatGPT Images 2.0: Even the noise is reproduced
When the reporter entered the prompt, "a realistic scene in which a young couple is spotted in the stands during a baseball broadcast," ChatGPT Images 2.0 faithfully reproduced the hallmarks of a live broadcast, including the camera angle, lighting, screen noise, and level of visual detail.
The result looked striking, as if the couple were really watching baseball from the stands. Nano Banana 2, by contrast, felt somewhat awkward. The couple in the crowd looked realistic, but they were facing the stands rather than the field, which made the image feel unnatural. Similar results were seen with more complex prompts as well.
The reporter entered a detailed and difficult prompt: "A brown rabbit in a spacesuit eating ramen while sitting on a wet street in a neon-lit cyberpunk city. Next to the rabbit is a small neon sign that reads 'Seoul 7578 Daegu,' and in the background is a flying car rendered with a shallow depth of field." Both models satisfied every requirement in the prompt, but ChatGPT Images 2.0 was stronger in specific areas such as the rendering of the rabbit, the depth-of-field effect, and color consistency. In particular, it rendered text far more accurately than earlier models, which struggled with non-Latin scripts such as Korean.
■ ChatGPT Images wins on UI/UX too
ChatGPT also came out ahead in post-generation usability. After clicking on a generated image, users can type the edits they want in natural language and have them applied immediately. This intuitive feature lets even ordinary users who are unfamiliar with complex graphics tools refine an image in detail, which greatly improves its practicality.
OpenAI previously said that as of the 26th of last month, one week after launch, daily active users (DAU) of ChatGPT Images 2.0 had risen by more than 60% from the previous week, while new user inflows had surged by more than 130%.
Meanwhile, as generative AI begins to see meaningful use in real-world work, model performance is advancing rapidly. Earlier this year, Anthropic's Opus 4.7 drew attention for its overwhelming performance in coding and workplace applications. OpenAI recently responded with GPT-5.5, creating a two-horse race, and Google Gemini is also said to be preparing a new model.
Reporter Ju Won-gyu (wongood@fnnews.com)