Is ChatGPT actually getting worse?

Alex Sword


The Financial Services Forum

Those concerned about Terminator-style AI taking over the world can breathe a temporary sigh of relief, after a new research paper found that ChatGPT’s performance at certain tasks had actually got worse over time.

Produced by academics at the American universities of Stanford and Berkeley, the paper evaluated how the performance of GPT-3.5 and GPT-4, the models behind ChatGPT, had changed across a series of set tasks between March and June.

The varied list of tasks included solving maths problems and generating code, as well as answering opinion survey questions and multi-hop questions, the latter of which depend on drawing information from different parts of the text.

The researchers found that GPT-4 was able to identify prime numbers with 84% accuracy in March, but by June, this had fallen to 51%. Both models performed worse at formatting code in June than in March.
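To make the accuracy figures concrete, here is a minimal sketch of how a score of this kind could be computed: each yes/no answer from a model is checked against the true primality of the number, and the share of correct answers is the accuracy. The numbers and answers below are invented purely for illustration and are not taken from the paper; the function and variable names are likewise hypothetical.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Ground truth: trial division up to the square root of n."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def accuracy(answers: dict[int, bool]) -> float:
    """Share of numbers whose yes/no answer matches the true primality."""
    correct = sum(
        1 for n, said_prime in answers.items() if said_prime == is_prime(n)
    )
    return correct / len(answers)


# Hypothetical answers from two snapshots of the same model
# (number -> "did the model say it is prime?"). Illustrative data only.
march_answers = {7: True, 9: False, 11: True, 15: False}
june_answers = {7: True, 9: True, 11: False, 15: False}

march = accuracy(march_answers)
june = accuracy(june_answers)
print(f"March: {march:.0%}, June: {june:.0%}, change: {june - march:+.0%}")
```

Running the same fixed question set against each snapshot is what makes the two scores comparable; a real evaluation would use far more questions than this toy example.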

The research also found that GPT-4’s accuracy in answering medical examination questions had fallen from 86.6% to 82.1%.

However, some capabilities had improved: visual reasoning scores rose on both models, while GPT-3.5’s maths scores improved overall.

The researchers said that, due to these rapid changes in performance, the large language models needed to be continually evaluated. The paper noted that “it is currently opaque when and how GPT-3.5 and GPT-4 are updated, and it is unclear how each update affects the behavior of these LLMs.”

The theory behind large language models is that, over time, input and feedback from users will train them to get better and better at performing tasks.

This is probably a temporary obstacle to generative AI’s inevitable rise, but it does show that it still has some way to go.
