---
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- facebook
- meta
- pytorch
- mistral
- inferentia2
- neuron
---
# Neuronx model for [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)

This repository contains [**AWS Inferentia2**](https://aws.amazon.com/ec2/instance-types/inf2/) and [`neuronx`](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/) compatible checkpoints for [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).
You can find detailed information about the base model on its [Model Card](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).

This model has been exported to the `neuron` format using the specific `input_shapes` and `compiler` parameters detailed in the paragraphs below.

Please refer to the 🤗 `optimum-neuron` [documentation](https://huggingface.co/docs/optimum-neuron/main/en/guides/models#configuring-the-export-of-a-generative-model) for an explanation of these parameters.

## Usage on Amazon SageMaker

_coming soon_

## Usage with 🤗 `optimum-neuron`

```python
>>> from optimum.neuron import pipeline

>>> p = pipeline('text-generation', 'aws-neuron/Mistral-7B-Instruct-v0.1-neuron-1x2048-2-cores')
>>> p("My favorite place on earth is", max_new_tokens=64, do_sample=True, top_k=50)
[{'generated_text': 'My favorite place on earth is the ocean. It is where I feel most
at peace. I love to travel and see new places. I have a'}]
```

This repository contains tags specific to versions of `neuronx`. When using it with 🤗 `optimum-neuron`, load the repo revision matching the version of `neuronx` you are running to get the right serialized checkpoints.
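
For example, a specific revision can be pinned when creating the pipeline. This is a minimal sketch: it assumes the `pipeline` helper forwards a `revision` argument to the Hub download (as the `transformers` pipeline does), and the branch name `2.15.0` is hypothetical, so check the repository's branches for the revisions that actually exist.

```python
>>> from optimum.neuron import pipeline

>>> # Pin the checkpoints serialized for a given neuronx release.
>>> # NOTE: revision '2.15.0' is a hypothetical branch name used for
>>> # illustration; list the repo's branches to find real revisions.
>>> p = pipeline(
...     'text-generation',
...     'aws-neuron/Mistral-7B-Instruct-v0.1-neuron-1x2048-2-cores',
...     revision='2.15.0',
... )
```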

## Arguments passed during export

**input_shapes**

```json
{
  "batch_size": 1,
  "sequence_length": 2048
}
```

**compiler_args**

```json
{
  "auto_cast_type": "bf16",
  "num_cores": 2
}
```
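
As a rough illustration of how these values map onto an export invocation, the sketch below assembles the equivalent `optimum-cli export neuron` command from the two parameter sets above. The flag names are taken from the `optimum-neuron` export guide and should be verified against your installed version; the output directory `mistral_neuron/` is a hypothetical path.

```python
# Parameters exactly as documented in this model card.
input_shapes = {"batch_size": 1, "sequence_length": 2048}
compiler_args = {"auto_cast_type": "bf16", "num_cores": 2}

# Each key becomes a CLI flag of the same name (an assumption based on
# the optimum-neuron export guide, not verified against every release).
flags = {**input_shapes, **compiler_args}
command = " ".join(
    ["optimum-cli export neuron", "--model mistralai/Mistral-7B-Instruct-v0.1"]
    + [f"--{name} {value}" for name, value in flags.items()]
    + ["mistral_neuron/"]  # hypothetical output directory
)
print(command)
```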

