Researchers studied how pre-training data size and task alignment affect the performance of large language models, and proposed a new scaling law to predict it.

via Hugging Face