---
license: apache-2.0
tags:
  - requests
  - gguf
  - quantized
---


Status:

Quant requests are momentarily PAUSED due to external circumstances.
I sincerely apologize for disrupting your experience!

Welcome to my GGUF-IQ-Imatrix Model Quantization Requests card!

Please read everything.

This card is meant only to request GGUF-IQ-Imatrix quants for models that meet the requirements below.

Requirements to request GGUF-Imatrix model quantizations:

For the model:

  • Maximum model parameter size of ~~11B~~ 12B. Note that models larger than 8B parameters may take longer to process and upload than smaller ones.
    At the moment I am unable to accept requests for larger models due to hardware/time limitations.
    Preferably Mistral- and Llama-3-based models in the creative/roleplay niche.
    If you need quants for a bigger model, you can try requesting at mradermacher's. He's doing amazing work.

Important:

  • Fill the request template as outlined in the next section.

How to request a model quantization:

  1. Open a New Discussion titled "Request: Model-Author/Model-Name", for example, "Request: Nitral-AI/Infinitely-Laydiculous-7B", without the quotation marks.

  2. Include the following template in your new discussion post. You can copy and paste it as is, then fill in the required information by replacing the {{placeholders}} (example request here):

```
**[Required] Model name:** <br>
{{replace-this}}

**[Required] Model link:** <br>
{{replace-this}}

**[Required] Brief description:** <br>
{{replace-this}}

**[Required] An image/direct image link to represent the model (square shaped):** <br>
{{replace-this}}

**[Optional] Additional quants (if you want any):** <br>

<!-- Keep in mind that anything below I/Q3 isn't recommended,  -->
<!-- since for these smaller models the results will likely be -->
<!-- highly incoherent, rendering them unusable for your needs. -->
```
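As a quick sanity check before posting, a small script can confirm that no `{{replace-this}}` placeholders remain and that the required fields are present in a filled-in template. This is only an illustrative sketch; the `validate_request` helper and the `REQUIRED_FIELDS` list are my own assumptions, not part of the request process.

```python
# Hypothetical helper: sanity-check a filled request template before posting.
REQUIRED_FIELDS = ["Model name", "Model link", "Brief description"]

def validate_request(text: str) -> list[str]:
    """Return a list of problems found in a filled request template."""
    problems = []
    if "{{replace-this}}" in text:
        problems.append("unfilled {{replace-this}} placeholder remains")
    for field in REQUIRED_FIELDS:
        if f"[Required] {field}" not in text:
            problems.append(f"missing required field: {field}")
    return problems

filled = """**[Required] Model name:** <br>
Nitral-AI/Infinitely-Laydiculous-7B

**[Required] Model link:** <br>
https://huggingface.co/Nitral-AI/Infinitely-Laydiculous-7B

**[Required] Brief description:** <br>
A creative/roleplay 7B merge.
"""
print(validate_request(filled))  # → []
```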


Default list of quants for reference:

        "IQ3_M", "IQ3_XXS",
        "Q4_0", "Q4_K_M", "Q4_K_S", "IQ4_XS",
        "Q5_K_M", "Q5_K_S",
        "Q6_K",
        "Q8_0"