Simon Willison summarizes Andrej Karpathy's notes on Meta's Llama 3, highlighting the jump in training data (15T tokens, up from Llama 2's 2T) and the much larger 128K-token vocabulary, but also the disappointingly small 8K context length.

via @simonw