
The European Broadcasting Union (EBU) and the BBC have jointly released new international research showing that nearly half (45%) of mainstream AI assistants' responses to news-related questions contain significant errors, and as many as 81% of answers have issues of some kind, varying in severity.
The study tested mainstream assistants on three major aspects.
The research covers 14 languages and 3,000 AI-assistant responses to news-related questions. The assistants tested include ChatGPT, Copilot, Gemini, and Perplexity. The research team evaluated each response on three major aspects:
Accuracy of the content.
Correctness of source attribution.
Whether 'facts' and 'opinions' can be clearly distinguished.
Nearly half of AI responses contained errors, with Gemini having the highest error rate.
The results show that 45% of the AI assistants' responses to news questions contained obvious errors, such as misleading information, incorrect citations, or outdated data. Overall, as many as 81% of responses had some issue, differing only in severity.
About 30% of responses contained 'source attribution' errors, such as failing to indicate a source, quoting inaccurate data, or citing the wrong source. Among all the assistants tested, Google's Gemini performed worst: 72% of its responses had significant sourcing issues, far more than the other assistants (mostly below 25%).
Additionally, about 20% of responses contained 'content accuracy' errors, most commonly outdated or incorrect information.
Gemini misreported regulations, and ChatGPT incorrectly stated that the Pope was still alive.
The research cited several specific examples:
Gemini once misreported the content of an amendment to disposable e-cigarette regulations.
During testing, ChatGPT still answered that 'Pope Francis is alive,' even though he had died months earlier.
These cases make clear that AI models still suffer from delayed updates and insufficient data sources when handling current news.
In response, Google has stated on its official website that it welcomes user feedback to keep improving the platform. OpenAI and Microsoft have both previously acknowledged that 'AI hallucination,' caused by insufficient data and model judgment errors, remains an unresolved problem. Perplexity claims that its 'deep search' mode achieves a factual accuracy rate of 93.9%.
EBU warns: a trust crisis could undermine democratic participation.
The EBU notes that as AI assistants gradually replace search engines as a source of news, a public unable to tell true information from false may ultimately choose to 'no longer believe anything,' which could weaken democratic participation.
The EBU also calls for AI companies to be brought into a 'news accountability system,' ensuring that when they handle news-related questions they provide verifiable sources and accurate facts, and clearly distinguish commentary from fact.
This article, 'AI news error rate nears 50%: Gemini made the most errors, ChatGPT mistakenly reported the Pope as still alive,' first appeared in Chain News ABMedia.

