nreimers committed
Commit 0957858 · 1 Parent(s): 78fef10
upload
- README.md +61 -0
- config.json +31 -0
- pytorch_model.bin +3 -0
- special_tokens_map.json +1 -0
- tokenizer_config.json +1 -0
- vocab.txt +0 -0
    	
        README.md
    ADDED
    
    | @@ -0,0 +1,61 @@ | |
| 1 | 
            +
            # Cross-Encoder for MS Marco
         | 
| 2 | 
            +
             | 
| 3 | 
            +
            This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.
         | 
| 4 | 
            +
             | 
| 5 | 
            +
            The model can be used for Information Retrieval: given a query, encode the query with all possible passages (e.g. retrieved with ElasticSearch), then sort the passages by score in decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)
         | 
| 6 | 
            +
             | 
| 7 | 
            +
             | 
| 8 | 
            +
            ## Usage with Transformers
         | 
| 9 | 
            +
             | 
| 10 | 
            +
            ```python
         | 
| 11 | 
            +
            from transformers import AutoTokenizer, AutoModelForSequenceClassification
         | 
| 12 | 
            +
            import torch
         | 
| 13 | 
            +
             | 
| 14 | 
            +
            model = AutoModelForSequenceClassification.from_pretrained('model_name')
         | 
| 15 | 
            +
            tokenizer = AutoTokenizer.from_pretrained('model_name')
         | 
| 16 | 
            +
             | 
| 17 | 
            +
            features = tokenizer(['How many people live in Berlin?', 'How many people live in Berlin?'], ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.'],  padding=True, truncation=True, return_tensors="pt")
         | 
| 18 | 
            +
             | 
| 19 | 
            +
            model.eval()
         | 
| 20 | 
            +
            with torch.no_grad():
         | 
| 21 | 
            +
                scores = model(**features).logits
         | 
| 22 | 
            +
                print(scores)
         | 
| 23 | 
            +
            ```
         | 
| 24 | 
            +
             | 
| 25 | 
            +
             | 
| 26 | 
            +
            ## Usage with SentenceTransformers
         | 
| 27 | 
            +
             | 
| 28 | 
            +
            The usage becomes easier when you have [SentenceTransformers](https://www.sbert.net/) installed. Then, you can use the pre-trained models like this:
         | 
| 29 | 
            +
            ```python
         | 
| 30 | 
            +
            from sentence_transformers import CrossEncoder
         | 
| 31 | 
            +
            model = CrossEncoder('model_name', max_length=512)
         | 
| 32 | 
            +
            scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2'), ('Query', 'Paragraph3')])
         | 
| 33 | 
            +
            ```
         | 
| 34 | 
            +
             | 
| 35 | 
            +
             | 
| 36 | 
            +
            ## Performance
         | 
| 37 | 
            +
            In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset. 
         | 
| 38 | 
            +
             | 
| 39 | 
            +
             | 
| 40 | 
            +
            | Model-Name        | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev)  | Docs / Sec |
         | 
| 41 | 
            +
            | ------------- |:-------------| -----| --- | 
         | 
| 42 | 
            +
            | **Version 2 models** | | | 
         | 
| 43 | 
            +
            | cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000
         | 
| 44 | 
            +
            | cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100
         | 
| 45 | 
            +
            | cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500
         | 
| 46 | 
            +
            | cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800
         | 
| 47 | 
            +
            | cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960
         | 
| 48 | 
            +
            | **Version 1 models** | | | 
         | 
| 49 | 
            +
            | cross-encoder/ms-marco-TinyBERT-L-2  | 67.43 | 30.15  | 9000
         | 
| 50 | 
            +
            | cross-encoder/ms-marco-TinyBERT-L-4  | 68.09 | 34.50  | 2900
         | 
| 51 | 
            +
            | cross-encoder/ms-marco-TinyBERT-L-6 |  69.57 | 36.13  | 680
         | 
| 52 | 
            +
            | cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340
         | 
| 53 | 
            +
            | **Other models** | | | 
         | 
| 54 | 
            +
            | nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900 
         | 
| 55 | 
            +
            | nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340 
         | 
| 56 | 
            +
            | nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100 
         | 
| 57 | 
            +
            | Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340 
         | 
| 58 | 
            +
            | amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330 
         | 
| 59 | 
            +
            | sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720
         | 
| 60 | 
            +
             
         | 
| 61 | 
            +
             Note: Runtime was computed on a V100 GPU.
         | 
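The re-ranking step the README describes (score each query-passage pair, then sort in decreasing order) can be sketched without loading a model. This is a minimal illustration, not code from the repository: `rerank` is a hypothetical helper, and the scores are made-up stand-ins for what `model.predict(...)` would return.

```python
from typing import List, Sequence, Tuple


def rerank(passages: Sequence[str], scores: Sequence[float]) -> List[Tuple[str, float]]:
    """Pair each passage with its score and sort by score, highest first."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical scores standing in for the output of model.predict(...)
passages = ["Paragraph1", "Paragraph2", "Paragraph3"]
example_scores = [0.2, 4.1, -3.0]

ranked = rerank(passages, example_scores)
for passage, score in ranked:
    print(f"{score:6.2f}  {passage}")
```

Because the scores are only compared with each other, the sort works the same whether they are raw logits or values passed through an activation function.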
    	
        config.json
    ADDED
    
    | @@ -0,0 +1,31 @@ | |
| 1 | 
            +
            {
         | 
| 2 | 
            +
              "_name_or_path": "cross-encoder/ms-marco-MiniLM-L-12-v2",
         | 
| 3 | 
            +
              "architectures": [
         | 
| 4 | 
            +
                "BertForSequenceClassification"
         | 
| 5 | 
            +
              ],
         | 
| 6 | 
            +
              "attention_probs_dropout_prob": 0.1,
         | 
| 7 | 
            +
              "gradient_checkpointing": false,
         | 
| 8 | 
            +
              "hidden_act": "gelu",
         | 
| 9 | 
            +
              "hidden_dropout_prob": 0.1,
         | 
| 10 | 
            +
              "hidden_size": 384,
         | 
| 11 | 
            +
              "id2label": {
         | 
| 12 | 
            +
                "0": "LABEL_0"
         | 
| 13 | 
            +
              },
         | 
| 14 | 
            +
              "initializer_range": 0.02,
         | 
| 15 | 
            +
              "intermediate_size": 1536,
         | 
| 16 | 
            +
              "label2id": {
         | 
| 17 | 
            +
                "LABEL_0": 0
         | 
| 18 | 
            +
              },
         | 
| 19 | 
            +
              "layer_norm_eps": 1e-12,
         | 
| 20 | 
            +
              "max_position_embeddings": 512,
         | 
| 21 | 
            +
              "model_type": "bert",
         | 
| 22 | 
            +
              "num_attention_heads": 12,
         | 
| 23 | 
            +
              "num_hidden_layers": 2,
         | 
| 24 | 
            +
              "pad_token_id": 0,
         | 
| 25 | 
            +
              "position_embedding_type": "absolute",
         | 
| 26 | 
            +
              "transformers_version": "4.4.2",
         | 
| 27 | 
            +
              "type_vocab_size": 2,
         | 
| 28 | 
            +
              "use_cache": true,
         | 
| 29 | 
            +
              "vocab_size": 30522,
         | 
| 30 | 
            +
              "sbert_ce_default_activation_function": "torch.nn.modules.linear.Identity"
         | 
| 31 | 
            +
            }
         | 
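As a reading aid for the config above, the sketch below parses a subset of its fields with the standard library and derives a few properties. It assumes the usual convention that one `id2label` entry corresponds to one output logit, and that an `Identity` value for `sbert_ce_default_activation_function` means scores are returned without a final activation. Both interpretations are assumptions here, not statements taken from this commit.

```python
import json

# Subset of the config.json fields shown in the diff above.
config_text = """
{
  "architectures": ["BertForSequenceClassification"],
  "hidden_size": 384,
  "id2label": {"0": "LABEL_0"},
  "max_position_embeddings": 512,
  "num_hidden_layers": 2,
  "sbert_ce_default_activation_function": "torch.nn.modules.linear.Identity"
}
"""

config = json.loads(config_text)

# Assumption: one label entry means the model emits one relevance score per pair.
num_labels = len(config["id2label"])

# Assumption: an Identity activation leaves the logits unchanged.
returns_raw_logits = config["sbert_ce_default_activation_function"].endswith("Identity")

summary = {
    "layers": config["num_hidden_layers"],
    "hidden_size": config["hidden_size"],
    "max_tokens": config["max_position_embeddings"],
    "num_labels": num_labels,
    "raw_logits": returns_raw_logits,
}
print(summary)
```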
    	
        pytorch_model.bin
    ADDED
    
    | @@ -0,0 +1,3 @@ | |
| 1 | 
            +
            version https://git-lfs.github.com/spec/v1
         | 
| 2 | 
            +
            oid sha256:e92ec9f854a5d8651f86db03375c55f4e4f893b7518177d4f2c8e31e3b9013a1
         | 
| 3 | 
            +
            size 62484521
         | 
    	
        special_tokens_map.json
    ADDED
    
    | @@ -0,0 +1 @@ | |
| 1 | 
            +
            {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
         | 
    	
        tokenizer_config.json
    ADDED
    
    | @@ -0,0 +1 @@ | |
| 1 | 
            +
            {"do_lower_case": true, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "special_tokens_map_file": "/home/ukp-reimers/.cache/huggingface/transformers/1e5909e4dfaa904617797ed35a6105a23daa56cbefca48fef329f772584699fb.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d", "name_or_path": "../output-cat/microsoft_MiniLM-L12-H384-uncased-2021-04-03_22-57-29", "do_basic_tokenize": true, "never_split": null}
         | 
    	
        vocab.txt
    ADDED
    
    The diff for this file is too large to render.
