Reranker scores aren't as expected
#5
by aridcolossal · opened
Hi, a closed issue (#2) shows the scores for the example snippet below. The scores I get are not close to the ones reported in those comments and do not seem right.
```python
from sentence_transformers import CrossEncoder

model_path = "ibm-granite/granite-embedding-reranker-english-r2"

# Load the reranker as a CrossEncoder
model = CrossEncoder(model_path)

passages = [
    "Romeo and Juliet is a play by William Shakespeare.",
    "Climate change refers to long-term shifts in temperatures.",
    "Shakespeare also wrote Hamlet and Macbeth.",
    "Water is an inorganic compound with the chemical formula H2O.",
    "In liquid form, H2O is also called 'water' at standard temperature and pressure.",
]

query = "what is the chemical formula of water?"

# Encodes query and passages jointly and computes relevance scores
ranks = model.rank(query, passages, return_documents=True)

# Print document rank and relevance score
for rank in ranks:
    print(f"- #{rank['corpus_id']} ({rank['score']}): {rank['text']}")
```
The scores shown in the #2 comments:

```text
- #3 (0.96): Water is an inorganic compound with the chemical formula H2O.
- #4 (0.87): In liquid form, H2O is also called 'water' at standard temperature and pressure.
- #1 (0.65): Climate change refers to long-term shifts in temperatures.
- #2 (0.62): Shakespeare also wrote Hamlet and Macbeth.
- #0 (0.60): Romeo and Juliet is a play by William Shakespeare.
```
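One possible cause of a mismatch like this is a scale difference: depending on the sentence-transformers version and the model config, `CrossEncoder` may return raw classifier logits or sigmoid-normalized probabilities, and the two look very different even when the ranking is identical. As a quick sanity check (the logit values below are made up for illustration, not actual outputs of this model), sigmoid maps unbounded logits into the (0, 1) range the scores above fall into:

```python
import math

def sigmoid(x: float) -> float:
    """Map an unbounded logit into the (0, 1) probability range."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw logits (NOT real outputs of this reranker), just to
# show the scale difference between logits and normalized scores.
raw_logits = [3.2, 1.9, 0.6, -0.4]

# Sigmoid preserves the ordering, so the ranking is unaffected either way.
scores = [sigmoid(x) for x in raw_logits]
print(scores)
```

If your scores fall outside [0, 1], comparing `sigmoid(score)` against the numbers from #2 would tell you whether the discrepancy is just a missing activation rather than a real model difference.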