Optimum Neuron Container
We provide pre-built Optimum Neuron containers for Amazon SageMaker. These containers come with all of the Hugging Face libraries and dependencies pre-installed, so you can start using them right away. We provide containers for training, for inference, and for optimized text generation with TGI. The table below only lists the latest version of each container; older versions can be found in the Deep Learning Container Release Notes.
You can use the get_huggingface_llm_image_uri function from the sagemaker Python SDK to retrieve the Text Generation Inference image URI for the container you want to use.
from sagemaker.huggingface import get_huggingface_llm_image_uri

# retrieve the TGI Neuron image URI
llm_image = get_huggingface_llm_image_uri("huggingface-neuronx")
If you have the Optimum Neuron package installed, you can instead use the image_uri function to retrieve the image URI for the container you want to use. The result is equivalent to the one returned by the sagemaker Python SDK, but the image URI it retrieves can be newer.
from optimum.neuron.utils import ecr
# retrieve the llm image uri
llm_image = ecr.image_uri("tgi")
print(f"llm image uri: {llm_image}")
Available Optimum Neuron Containers
Type | Optimum Version | Image URI |
---|---|---|
Training | 0.3.0 | 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training-neuronx:2.7.0-transformers4.51.0-neuronx-py310-sdk2.24.1-ubuntu22.04 |
Inference | 0.3.0 | 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference-neuronx:2.7.1-transformers4.51.3-neuronx-py310-sdk2.24.1-ubuntu22.04 |
Text Generation Inference | 0.3.0 | 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.7.0-optimum3.3.6-neuronx-py310-ubuntu22.04 |
Please replace 763104351884 with the correct AWS account ID for your region, and us-west-2 with the AWS region you are working in.
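To show how a container URI from the table plugs into the SageMaker Python SDK, here is a minimal training sketch using the training image above; the entry point, source directory, hyperparameters, and instance type are illustrative assumptions rather than values from this page.

import sagemaker
from sagemaker.huggingface import HuggingFace

# Sketch only: entry point, source directory, hyperparameters, and instance
# type are illustrative assumptions, not values from this page.
role = sagemaker.get_execution_role()

huggingface_estimator = HuggingFace(
    entry_point="train.py",           # assumed training script
    source_dir="./scripts",           # assumed local source directory
    instance_type="ml.trn1.2xlarge",  # assumed Trainium instance type
    instance_count=1,
    role=role,
    image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training-neuronx:2.7.0-transformers4.51.0-neuronx-py310-sdk2.24.1-ubuntu22.04",
    hyperparameters={"model_name_or_path": "bert-base-uncased", "epochs": 1},
)

huggingface_estimator.fit()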