Epoch AI


Epoch AI's researchers estimated that computer scientists could deplete the stock of high-quality language data as early as this year, exhaust low-quality language data within two decades, and run out of image data between the late 2030s and the mid-2040s.

In theory, synthetic data generated by AI models themselves could refill drained data pools, but that is not ideal: training on such data has been shown to lead to model collapse.

Research has also shown that generative imaging models trained solely on synthetic data exhibit a significant drop in output quality.
