Llama 3 in [8B and 70B] sizes is out


Llama 3 has just been rolled out, exactly 9 months after the release of Llama 2. It is already available for chat on Meta's website and can be downloaded from Hugging Face in safetensors or GGUF format.
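If you want to try the weights locally, a minimal sketch using the transformers library could look like the snippet below. The repo id meta-llama/Meta-Llama-3-8B-Instruct and the generation settings are assumptions on my part; the repo is gated, so you first need to accept Meta's license and log in with a Hugging Face token.

```python
# Minimal sketch: loading Llama 3 8B Instruct from Hugging Face with transformers.
# Assumes the gated repo id below and a prior `huggingface-cli login`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # safetensors weights, half-precision
    device_map="auto",
)

# Build a chat prompt with the model's chat template and generate a reply.
messages = [{"role": "user", "content": "Summarize what is new in Llama 3."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Loading in bfloat16 keeps the 8B model at roughly 16 GB of weights, so it should fit on a single 24 GB GPU; the GGUF files are the option to reach for on CPU-only machines.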

While the previous generation was trained on a dataset of 2 trillion tokens, the new one was trained on 15 trillion tokens.

What is fascinating is how the smaller 8B version outperformed the bigger previous-gen 70B model in every benchmark listed on the model card:

| Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|---|---|---|---|---|---|
| GPQA (0-shot) | 34.2 | 21.7 | 22.3 | 39.5 | 21.0 |
| HumanEval (0-shot) | 62.2 | 7.9 | 14.0 | 81.7 | 25.6 |
| GSM-8K (8-shot, CoT) | 79.6 | 25.7 | 77.4 | 93.0 | 57.5 |
| MATH (4-shot, CoT) | 30.0 | 3.8 | 6.7 | 50.4 | 11.6 |

Llama 3 has also upped the context window size from 4k to 8k tokens.
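As a quick sanity check, the context window can be read straight from the model configs on Hugging Face. The repo ids below are assumptions, and both repos are gated behind Meta's license, so a logged-in token is required:

```python
# Minimal sketch: comparing advertised context windows via the Hugging Face configs.
from transformers import AutoConfig

for repo in ("meta-llama/Llama-2-7b-hf", "meta-llama/Meta-Llama-3-8B"):
    cfg = AutoConfig.from_pretrained(repo)
    # max_position_embeddings holds the context window: 4096 for Llama 2, 8192 for Llama 3.
    print(repo, cfg.max_position_embeddings)
```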
