We can all agree that training data is essential to any machine learning model, but what happens when that data runs out? According to a recent article in New Scientist, the high-quality language data used to train models such as ChatGPT could run out as soon as 2026.

High-quality language data includes books and scientific papers, which are slow and costly to produce. Lower-quality data, such as posts on blogs, forums and social media, is plentiful, but models trained on it may struggle to deliver the paradigm-shifting advances seen in machine learning recently. Not only is this data shortage likely to slow development, it could also send the cost of training data rocketing.

But all is not lost. These predictions are based on human-created data, whereas synthetic data can also be generated, offering a potentially unlimited supply. The effectiveness of synthetic data for training machine learning models still has to be evaluated, but it certainly opens up new training opportunities. More efficient learning algorithms are also being developed all the time, enabling models to extract more knowledge from existing data sets, learn from smaller data sets and even transfer learning from one task to another.
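As a toy illustration of one way synthetic text can be produced programmatically, the sketch below fills invented templates with invented vocabulary. Everything in it (the templates, slot names and word lists) is made up for this example; real synthetic-data pipelines typically use a generative model rather than hand-written templates, but the principle of creating new training examples on demand is the same.

```python
import random

# Hypothetical templates and vocabulary, invented purely for illustration.
TEMPLATES = [
    "The {subject} improves the {object}.",
    "A {subject} rarely changes the {object}.",
]
SLOTS = {
    "subject": ["model", "researcher", "dataset"],
    "object": ["benchmark", "result", "paper"],
}

def generate_synthetic_sentences(n, seed=0):
    """Return n synthetic sentences built from the templates above.

    A fixed seed makes the output reproducible, which matters when the
    generated data is used to train or evaluate a model.
    """
    rng = random.Random(seed)
    sentences = []
    for _ in range(n):
        template = rng.choice(TEMPLATES)
        fills = {slot: rng.choice(words) for slot, words in SLOTS.items()}
        sentences.append(template.format(**fills))
    return sentences

# Example: generate a small synthetic corpus.
for sentence in generate_synthetic_sentences(3):
    print(sentence)
```

Because the generator is just code, the supply of examples is limited only by compute, which is exactly what makes synthetic data attractive when human-created data is scarce.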

I look forward to reading about innovations in these areas over the coming years and I am sure we will continue to see huge leaps in AI development into 2026 and beyond.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.