TestEmbeddingModel GGUF
Recommended way to run this model:
llama-server -hf danbev/TestEmbeddingModel-GGUF --embedding --pooling none
Then the endpoint can be accessed at http://localhost:8080/embedding, for
example using curl:
curl --request POST \
--url http://localhost:8080/embedding \
--header "Content-Type: application/json" \
--data '{"input": "Hello embeddings", "embd_normalize": -1}' \
--silent
Alternatively, the llama-embedding command-line tool can be used:
llama-embedding -hf danbev/TestEmbeddingModel-GGUF --pooling none --embd-normalize 2 --verbose-prompt -p "Hello embeddings"
embd_normalize
When a pooling method is specified, normalization can be controlled with the
embd_normalize parameter. The default value is 2, which means that the
embeddings are normalized using the Euclidean (L2) norm. The supported values are:
- -1: No normalization
- 0: Max absolute
- 1: Taxicab
- 2: Euclidean/L2
- >2: P-Norm
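To make the options above concrete, here is a minimal Python sketch of what each embd_normalize value could compute for a single embedding vector. The normalize function is a hypothetical helper written for illustration and is not part of llama.cpp. It assumes that "max absolute" divides by the largest absolute component, so the exact scale factor used by llama.cpp may differ.

```python
def normalize(vec, mode=2):
    """Illustrative normalization of one embedding vector (hypothetical helper).

    mode follows the embd_normalize values listed above:
    -1 none, 0 max absolute, 1 taxicab, 2 Euclidean, >2 p-norm.
    """
    if mode < -1:
        raise ValueError("mode must be -1 or greater")
    if mode == -1:
        # No normalization: return the values unchanged.
        return list(vec)
    if mode == 0:
        # Assumption: scale by the largest absolute component.
        norm = max((abs(x) for x in vec), default=0.0)
    else:
        # Taxicab (p=1), Euclidean (p=2) and p > 2 share the p-norm formula.
        norm = sum(abs(x) ** mode for x in vec) ** (1.0 / mode)
    if norm == 0:
        # Avoid dividing by zero for an all-zero or empty vector.
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


if __name__ == "__main__":
    for mode in (-1, 0, 1, 2, 3):
        print(mode, normalize([3.0, -4.0], mode))
```

With the Euclidean setting the resulting vector has length 1, which is what makes dot products between such embeddings equal to their cosine similarity.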