27 January 2023

Navigating The Future Of AI Training: Addressing The Data Shortage Challenge

Marks & Clerk

Contributor

Marks & Clerk is one of the UK’s foremost firms of Patent and Trade Mark Attorneys. Our attorneys and solicitors are wired directly into the UK’s leading business and innovation economies. Alongside this we have offices in 9 international locations covering the EU, Canada and Asia, meaning we offer clients the best possible service locally, nationally and internationally.

We can all agree that training data is essential to any machine learning model, but what happens when that data runs out? According to a recent article in New Scientist, the high-quality language data used to train models such as ChatGPT could run out as soon as 2026.

High-quality language data, such as books and scientific papers, is slow and costly to generate. Lower-quality data, such as posts on blogs, forums and social media, is plentiful, but machine learning models trained on it may struggle to deliver the paradigm-shifting developments seen in machine learning models recently. This data shortage is not only likely to slow development; it could also send the cost of training data rocketing.

But all is not lost. These predictions are based on human-created data, whereas synthetic data can also be generated, offering a potentially unlimited source. The effectiveness of synthetic data for training machine learning models still needs to be evaluated, but it certainly provides new opportunities for training. In addition, more efficient learning algorithms are being developed all the time, and these can enable models to extract more knowledge from existing data sets, learn from smaller data sets and even transfer learning from one task to another.

I look forward to reading about innovations in these areas over the coming years and I am sure we will continue to see huge leaps in AI development into 2026 and beyond.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
