TestEmbeddingModel GGUF

Recommended way to run this model:

llama-server -hf danbev/TestEmbeddingModel-GGUF --embedding --pooling none

The embedding endpoint is then available at http://localhost:8080/embedding and can be queried with curl, for example:

curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello embeddings", "embd_normalize": -1}' \
    --silent
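
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it mirrors the curl call above and assumes the server is running locally on port 8080. The helper names `build_request` and `fetch_embedding` are hypothetical, and the response is returned as decoded JSON without assuming anything about its shape.

```python
import json
import urllib.request

# Assumption: llama-server is running locally with the command shown above.
EMBEDDING_URL = "http://localhost:8080/embedding"


def build_request(text, embd_normalize=-1, url=EMBEDDING_URL):
    """Build the POST request equivalent to the curl example (hypothetical helper)."""
    payload = {"input": text, "embd_normalize": embd_normalize}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(text, embd_normalize=-1, url=EMBEDDING_URL):
    """Send the request and return the decoded JSON body as-is."""
    request = build_request(text, embd_normalize, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `fetch_embedding("Hello embeddings")` returns whatever JSON the endpoint produces; inspect it before relying on a particular structure.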

Alternatively, the llama-embedding command-line tool can be used:

llama-embedding -hf danbev/TestEmbeddingModel-GGUF --pooling none --embd-normalize 2 --verbose-prompt -p "Hello embeddings"

embd_normalize

When a pooling method is specified, normalization is controlled by the embd_normalize parameter. The default value is 2, which normalizes the embeddings with the Euclidean (L2) norm. The available values are:

-1 No normalization
0 Max absolute
1 Taxicab
2 Euclidean/L2
>2 P-Norm
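
To make the options above concrete, here is a rough Python sketch of what each value does to a single embedding vector. It is an illustration of the norms named in the list, not the llama.cpp implementation: the function name `normalize` is hypothetical, and the max-absolute case here simply divides by the largest absolute component, whereas the actual implementation may apply an additional scale factor.

```python
import math


def normalize(vec, embd_normalize=2):
    """Illustrative normalization of one embedding vector (hypothetical helper)."""
    if embd_normalize == -1:
        # -1: no normalization, return the values unchanged
        return list(vec)
    if embd_normalize < -1:
        raise ValueError("unsupported embd_normalize value")

    if embd_normalize == 0:
        # 0: max absolute (assumption: plain division by the largest |x|)
        norm = max((abs(x) for x in vec), default=0.0)
    elif embd_normalize == 1:
        # 1: taxicab norm, the sum of absolute values
        norm = sum(abs(x) for x in vec)
    elif embd_normalize == 2:
        # 2: Euclidean (L2) norm
        norm = math.sqrt(sum(x * x for x in vec))
    else:
        # >2: general p-norm with p = embd_normalize
        p = embd_normalize
        norm = sum(abs(x) ** p for x in vec) ** (1.0 / p)

    if norm == 0:
        # Avoid division by zero for an all-zero (or empty) vector
        return [0.0] * len(vec)
    return [x / norm for x in vec]
```

For example, `normalize([3.0, 4.0], 2)` divides both components by 5, while `normalize([3.0, 4.0], -1)` leaves them untouched.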
