Reranker scores aren't as expected
#5
by aridcolossal · opened
Hi, a closed issue (#2) shows the scores for the example snippet below. The scores I get are not close to the ones reported in those comments and do not seem right.
```python
from sentence_transformers import CrossEncoder

model_path = "ibm-granite/granite-embedding-reranker-english-r2"

# Load the reranker as a CrossEncoder
model = CrossEncoder(model_path)

passages = [
    "Romeo and Juliet is a play by William Shakespeare.",
    "Climate change refers to long-term shifts in temperatures.",
    "Shakespeare also wrote Hamlet and Macbeth.",
    "Water is an inorganic compound with the chemical formula H2O.",
    "In liquid form, H2O is also called 'water' at standard temperature and pressure.",
]

query = "what is the chemical formula of water?"

# Encodes query and passages jointly and computes relevance scores
ranks = model.rank(query, passages, return_documents=True)

# Print document rank and relevance score
for rank in ranks:
    print(f"- #{rank['corpus_id']} ({rank['score']}): {rank['text']}")
```
The scores shown in the #2 comments:

```text
- #3 (0.96): Water is an inorganic compound with the chemical formula H2O.
- #4 (0.87): In liquid form, H2O is also called 'water' at standard temperature and pressure.
- #1 (0.65): Climate change refers to long-term shifts in temperatures.
- #2 (0.62): Shakespeare also wrote Hamlet and Macbeth.
- #0 (0.60): Romeo and Juliet is a play by William Shakespeare.
```
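One possible cause of a mismatch like this is a scale difference: depending on the sentence-transformers version and the model config, `CrossEncoder` may return raw classifier logits or sigmoid-normalized probabilities, and the two look very different even when the ranking is identical. As a quick sanity check (the logit values below are made up for illustration, not actual outputs of this model), sigmoid maps unbounded logits into the (0, 1) range the scores above fall into:

```python
import math

def sigmoid(x: float) -> float:
    """Map an unbounded logit into the (0, 1) probability range."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw logits (NOT real outputs of this reranker), just to
# show the scale difference between logits and normalized scores.
raw_logits = [3.2, 1.9, 0.6, -0.4]

# Sigmoid preserves the ordering, so the ranking is unaffected either way.
scores = [sigmoid(x) for x in raw_logits]
print(scores)
```

If your scores fall outside [0, 1], comparing `sigmoid(score)` against the numbers from #2 would tell you whether the discrepancy is just a missing activation rather than a real model difference.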