Download Llama Models: A Complete Guide (Llama 3.2, 3.1, 3, 2, and Llama Guard)
This post walks through how to download the Llama models.
Go to the Download models page on the Meta Llama website.
Fill in the requested information, select the models you want, and read the license terms.
Requested models:
- Llama 3.2 1B
- Llama 3.2 3B
- Llama Guard 3 1B
- Llama Guard 3 1B INT4
- Llama 3.2 11B
- Llama 3.2 90B
- Llama Guard 3 11B Vision
- Llama 3.1 8B
- Llama 3.1 70B
- Llama 3.1 405B
- Llama Guard 3 8B
- Prompt Guard
The models listed above are now available to you as a commercial license holder. By downloading a model, you are agreeing to the terms and conditions of the License, Acceptable Use Policy, and Meta's privacy policy.
How to download the model
Visit the Llama repository on GitHub; the download instructions can be found in the Llama README.
Install the Llama CLI
pip install llama-stack
List the available models:
llama model list
To see all models, including older versions:
llama model list --show-all
After running the command, the output will look something like this:
+----------------------------------+------------------------------------------+----------------+
| Model Descriptor | HuggingFace Repo | Context Length |
+----------------------------------+------------------------------------------+----------------+
| Llama-2-7b | meta-llama/Llama-2-7b | 4K |
+----------------------------------+------------------------------------------+----------------+
| Llama-2-13b | meta-llama/Llama-2-13b | 4K |
+----------------------------------+------------------------------------------+----------------+
| Llama-2-70b | meta-llama/Llama-2-70b | 4K |
+----------------------------------+------------------------------------------+----------------+
| Llama-2-7b-chat | meta-llama/Llama-2-7b-chat | 4K |
+----------------------------------+------------------------------------------+----------------+
| Llama-2-13b-chat | meta-llama/Llama-2-13b-chat | 4K |
+----------------------------------+------------------------------------------+----------------+
| Llama-2-70b-chat | meta-llama/Llama-2-70b-chat | 4K |
+----------------------------------+------------------------------------------+----------------+
| Llama-3-8B | meta-llama/Llama-3-8B | 8K |
+----------------------------------+------------------------------------------+----------------+
| Llama-3-70B | meta-llama/Llama-3-70B | 8K |
+----------------------------------+------------------------------------------+----------------+
| Llama-3-8B-Instruct | meta-llama/Llama-3-8B-Instruct | 8K |
+----------------------------------+------------------------------------------+----------------+
| Llama-3-70B-Instruct | meta-llama/Llama-3-70B-Instruct | 8K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-8B | meta-llama/Llama-3.1-8B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-70B | meta-llama/Llama-3.1-70B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-405B:bf16-mp8 | meta-llama/Llama-3.1-405B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-405B | meta-llama/Llama-3.1-405B-FP8 | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-405B:bf16-mp16 | meta-llama/Llama-3.1-405B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-8B-Instruct | meta-llama/Llama-3.1-8B-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-70B-Instruct | meta-llama/Llama-3.1-70B-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-405B-Instruct:bf16-mp8 | meta-llama/Llama-3.1-405B-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-405B-Instruct | meta-llama/Llama-3.1-405B-Instruct-FP8 | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.1-405B-Instruct:bf16-mp16 | meta-llama/Llama-3.1-405B-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-1B | meta-llama/Llama-3.2-1B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-3B | meta-llama/Llama-3.2-3B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-11B-Vision | meta-llama/Llama-3.2-11B-Vision | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-90B-Vision | meta-llama/Llama-3.2-90B-Vision | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-1B-Instruct | meta-llama/Llama-3.2-1B-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-3B-Instruct | meta-llama/Llama-3.2-3B-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-11B-Vision-Instruct | meta-llama/Llama-3.2-11B-Vision-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama3.2-90B-Vision-Instruct | meta-llama/Llama-3.2-90B-Vision-Instruct | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama-Guard-3-11B-Vision | meta-llama/Llama-Guard-3-11B-Vision | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama-Guard-3-1B:int4-mp1 | meta-llama/Llama-Guard-3-1B-INT4 | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama-Guard-3-1B | meta-llama/Llama-Guard-3-1B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama-Guard-3-8B | meta-llama/Llama-Guard-3-8B | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama-Guard-3-8B:int8-mp1 | meta-llama/Llama-Guard-3-8B-INT8 | 128K |
+----------------------------------+------------------------------------------+----------------+
| Prompt-Guard-86M | meta-llama/Prompt-Guard-86M | 128K |
+----------------------------------+------------------------------------------+----------------+
| Llama-Guard-2-8B | meta-llama/Llama-Guard-2-8B | 4K |
+----------------------------------+------------------------------------------+----------------+
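The full table is long. As a small illustration, the descriptor column can be pulled out of this table format with standard Unix tools (a sketch; here a two-line sample stands in for the real `llama model list --show-all` output):

```shell
# Sample of the table output (a stand-in for the real command's output)
sample_output='| Llama3.2-1B                      | meta-llama/Llama-3.2-1B                  | 128K           |
| Llama3.2-3B                      | meta-llama/Llama-3.2-3B                  | 128K           |'

# Split on "|", strip spaces from the second field (the descriptor), print it
echo "$sample_output" | awk -F'|' '{ gsub(/ /, "", $2); if ($2 != "") print $2 }'
```

Piping the real command's output through the same `awk` filter yields a plain list of descriptors (plus the squashed header row, which is easy to skip) that you can feed into the download command below.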
Select a model
Download a model by running the following, replacing MODEL_ID with a descriptor from the table above:
llama model download --source meta --model-id MODEL_ID
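To fetch several models in one go, the command can be wrapped in a loop. A minimal sketch (the model IDs are examples from the table above; the `echo` makes this a dry run that just prints each command, so remove it to actually download):

```shell
# Dry run: print the download command for each model ID (drop `echo` to run)
for model_id in Llama3.2-1B Llama3.2-3B-Instruct Llama-Guard-3-1B; do
  echo llama model download --source meta --model-id "$model_id"
done
```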
Specify custom URL
When the script asks for your unique custom URL, paste the URL you received.
Note: a unique URL is provided for every request. Save copies of your custom URLs; each remains valid for 48 hours, and each model can be downloaded up to 5 times. Requests can be submitted multiple times. An email with download instructions will also be sent to the address you used to request the models.
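Since each signed URL expires 48 hours after it is issued, it can help to record when you received it. A small sketch (assumes you saved the issue time as a Unix epoch timestamp; `url_still_valid` is a hypothetical helper, not part of the llama CLI):

```shell
# Return success (0) if a URL issued at the given epoch time is still
# inside its 48-hour download window.
url_still_valid() {
  issued_at=$1
  now=$(date +%s)
  age=$(( now - issued_at ))
  [ "$age" -lt $(( 48 * 3600 )) ]   # 48 hours = 172800 seconds
}

# Example: a URL issued one hour ago is still usable
url_still_valid $(( $(date +%s) - 3600 )) && echo "still valid"
```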
Available models
Available models for download include:
- Pretrained:
- Llama-3.2-1B
- Llama-3.2-3B
- Llama-3.2-11B-Vision
- Llama-3.2-90B-Vision
- Llama-3.1-8B
- Llama-3.1-70B
- Llama-3.1-405B-MP16
- Llama-3.1-405B-FP8
- Fine-tuned:
- Llama-3.2-1B-Instruct
- Llama-3.2-3B-Instruct
- Llama-3.2-11B-Vision-Instruct
- Llama-3.2-90B-Vision-Instruct
- Llama-3.1-8B-Instruct
- Llama-3.1-70B-Instruct
- Llama-3.1-405B-Instruct
- Llama-3.1-405B-Instruct-MP16
- Llama-3.1-405B-Instruct-FP8
- Llama Guard:
- Llama-Guard-3-1B
- Llama-Guard-3-1B-INT4
- Llama-Guard-3-11B-Vision
- Llama-Guard-3-8B
- Llama-Guard-3-8B-INT8
- Llama-Guard-2-8B
- Llama-Guard-8B
- Prompt-Guard-86M
Note for 405B:
- We are releasing multiple versions of the 405B model to accommodate its large size and facilitate multiple deployment options:
- MP16 (Model Parallel 16) is the full version of the BF16 weights. These weights can only be served across multiple nodes using pipelined parallel inference; serving them requires at least two nodes with 8 GPUs each.
- MP8 (Model Parallel 8) is also the full version of the BF16 weights, but it can be served on a single node with 8 GPUs by using dynamic FP8 (Floating Point 8) quantization. We are providing reference code for it. You can download these weights and experiment with quantization techniques beyond what we provide.
- FP8 (Floating Point 8) is a quantized version of the weights. These weights can be served on a single node with 8 GPUs by using static FP8 quantization. We have provided reference code for it as well.
- The 405B model requires significant storage and computational resources: it occupies approximately 750GB of disk space and needs two nodes for MP16 inference.
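The ~750GB figure follows directly from the parameter count: 405 billion parameters at 2 bytes each (BF16) is about 810 GB, or roughly 754 GiB. A quick back-of-the-envelope check:

```shell
# 405B parameters x 2 bytes per parameter (BF16) = raw weight size
params=405000000000
bytes_per_param=2
total_bytes=$(( params * bytes_per_param ))
gib=$(( total_bytes / 1024 / 1024 / 1024 ))
echo "${gib} GiB"   # ~754 GiB, matching the ~750GB quoted above
```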
Recommended tools
Code Shield
A system-level safeguard, Code Shield adds support for inference-time filtering of insecure code produced by LLMs. It helps mitigate the risk of insecure code suggestions, prevents code interpreter abuse, and supports the secure execution of commands.
Now available on GitHub
Cybersecurity Eval
The first and most comprehensive set of open source cybersecurity safety evals for LLMs. These benchmarks are based on industry guidance and standards (e.g. CWE & MITRE ATT&CK) and built in collaboration with our security subject matter experts.
Now available on GitHub
Helpful tips
Please read the instructions in the GitHub repo and use the provided code examples to understand how to best interact with the models. In particular, for the fine-tuned models you must use appropriate formatting and correct system/instruction tokens to get the best results from the model.
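For the Llama 3.x Instruct models, the expected format wraps each turn in special header and end-of-turn tokens. A sketch of assembling such a prompt string by hand (normally your inference framework's chat template does this for you; the messages here are just examples):

```shell
system_msg="You are a helpful assistant."
user_msg="What is the capital of France?"

# Each role's message sits between <|start_header_id|>role<|end_header_id|>
# and <|eot_id|>; the trailing assistant header cues the model to respond.
prompt="<|begin_of_text|><|start_header_id|>system<|end_header_id|>

${system_msg}<|eot_id|><|start_header_id|>user<|end_header_id|>

${user_msg}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"
printf '%s' "$prompt"
```

Sending a plain, untagged string to an Instruct model tends to produce noticeably worse results, which is why the README's formatting guidance matters.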
You can find additional information about how to responsibly deploy Llama models in our Responsible Use Guide.
Review our Documentation to start building
If you need to report issues
If you or any Llama user becomes aware of any violation of our license or acceptable use policies — or any bug or issues with Llama that could lead to any such violations — please report it through one of the following means:
- Reporting issues with the model
- Giving feedback about potentially problematic output generated by the model
- Reporting bugs and security concerns
- Reporting violations of the Acceptable Use Policy: LlamaUseReport@meta.com