yzhouchen001 committed on
Commit 2500245 · 1 Parent(s): ab6e770

cleaned up description

Files changed (2)
  1. README.md +48 -24
  2. app.py +2 -2
README.md CHANGED
@@ -12,25 +12,44 @@ python_version: 3.11.7
 
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
- # MultiView Projection (MVP) for Spectra Annotation
-
- ### Yan Zhou Chen, Soha Hassoun
- #### Department of Computer Science, Tufts University
- This repository provides the implementation of MultiView Projection (MVP). MVP can be used to rank a set of molecular candidates given a spectrum.
-
- ## Table of Contents
- 1. [Install & setup]
- 2. [Data prep]
- 3. [MassSpecGym data download]
- 4. [Use our pretrained model]
- 5. [Training from scratch]
- 6. [References]
-
- ## Install & setup
- 1. Clone the repository: git clone <REPO_link>
  2. Install environment or only key packages:
  ```
- conda env create -f environment.yml
  ```
  #### Key packages
  - python
@@ -44,7 +63,9 @@ conda env create -f environment.yml
  - massspecgym
  - lightning
 
- ## Data prep
  We provide sample spectra data and candidates in `data/sample`.
  For preprocessing:
  1. If using formSpec, compute subformula labels
@@ -75,7 +96,7 @@ python test.py --param_pth params_binnedSpec.yaml
  ```
 
  We provide a notebook showing sample result files in `notebooks/demo.ipynb`
-
  ## MassSpecGym data download
  Our model is trained on the [MassSpecGym dataset](https://github.com/pluskal-lab/MassSpecGym). Follow their instructions to download the spectra and candidate dataset.
 
@@ -83,8 +104,7 @@ You can preprocess the MassSpecGym dataset as described in the above section or
  ```
  mkdir data/msgym/
  cd data/msgym
- wget
- wget
  ```
  ## Training from scratch
  To train a model from scratch:
@@ -98,10 +118,14 @@ python train.py --param_pth params_formSpec.yaml
  # If using binnedSpec
  python train.py --param_pth params_binnedSpec.yaml
  ```
 
- ## References
 
 
- #### Contact
 
  =======
 
 
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
+ # 🏆 MultiView Projection (MVP) for Spectra Annotation
+
+
+ ### Authors
+ **Yan Zhou Chen, Soha Hassoun**
+ Department of Computer Science, Tufts University
+
+ ---
+
+ MVP is a framework for **ranking molecular candidates given a spectrum**. This repository provides the official implementation, pretrained models, and utilities for data preparation and training.
+
+ ---
+
+ ## 📑 Table of Contents
+ 0. [Quick Test](#quick-test)
+ 1. [Install & Setup](#install--setup)
+ 2. [Data Preparation](#data-prep)
+ 3. [MassSpecGym Data Download](#massspecgym-data-download)
+ 4. [Using the Pretrained Model](#use-our-pretrained-model)
+ 5. [Training from Scratch](#training-from-scratch)
+ 6. [References](#references)
+
+ ---
+
+
+ ## 🚀 Quick Test
+ Run MVP instantly with our [interactive app](https://huggingface.co/spaces/HassounLab/MVP) for small-scale experiments.
+
+ ---
+
+
+ ## ⚙️ Install & setup
+ 1. Clone the repository: `git clone https://huggingface.co/spaces/HassounLab/MVP/`
  2. Install environment or only key packages:
  ```
+ conda create -n mvp python=3.11
+ conda activate mvp
+ pip install -r requirements.txt
  ```
  #### Key packages
  - python

  - massspecgym
  - lightning
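
After the install step, one way to confirm the environment is usable is to check that the key packages can be found by Python. The snippet below is a minimal sketch and not part of the repository: the helper `missing_packages` is hypothetical, the tuple holds only the two package names visible in this diff, and it assumes each package's import name matches the name listed.

```python
from importlib.util import find_spec

# Only the package names visible in this diff; the README's full list is longer.
# Assumption: each package's import name matches the name listed.
KEY_PACKAGES = ("massspecgym", "lightning")


def missing_packages(names=KEY_PACKAGES):
    """Return the top-level packages in `names` that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


# Example, run inside the activated `mvp` environment:
#   print(missing_packages() or "all key packages found")
```

An empty list means every listed package was located; it does not verify versions.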
 
+ ---
+
+ ## 📂 Data prep
  We provide sample spectra data and candidates in `data/sample`.
  For preprocessing:
  1. If using formSpec, compute subformula labels

  ```
 
  We provide a notebook showing sample result files in `notebooks/demo.ipynb`
+ ---
  ## MassSpecGym data download
  Our model is trained on the [MassSpecGym dataset](https://github.com/pluskal-lab/MassSpecGym). Follow their instructions to download the spectra and candidate dataset.
 

  ```
  mkdir data/msgym/
  cd data/msgym
+ wget https://zenodo.org/records/15223987/files/msgym_preprocessed.zip?download=1
  ```
  ## Training from scratch
  To train a model from scratch:

  # If using binnedSpec
  python train.py --param_pth params_binnedSpec.yaml
  ```
+ ---
 
+ ## 📚 References
+ Preprint: [Learning from All Views: A Multiview Contrastive Framework for Metabolite Annotation](https://www.biorxiv.org/content/10.1101/2025.11.12.688047v1)
+
+ ---
 
+ ## 📧 Contact
+ For questions, reach out to: [email protected]
 
  =======
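
The `wget` line added in the MassSpecGym data download section of the README uses a URL ending in `?download=1`, so depending on `wget` settings the saved file name may include that query string. The sketch below shows one way to do the same download from Python with a clean file name. It is an illustration under stated assumptions, not code from the repository: the helper names are hypothetical, the target directory mirrors the `data/msgym` path in the README, and the archive is assumed to be a standard zip file.

```python
import zipfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URL copied from the wget command in the README diff.
MSGYM_URL = "https://zenodo.org/records/15223987/files/msgym_preprocessed.zip?download=1"


def archive_name_from_url(url: str) -> str:
    """Return the file name from the URL path, ignoring any query string."""
    return Path(urlparse(url).path).name


def download_and_extract(url: str = MSGYM_URL, dest_dir: str = "data/msgym") -> Path:
    """Download the archive into dest_dir (if not already there) and unzip it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name_from_url(url)
    if not archive.exists():
        urlretrieve(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return dest


# Example (performs a network download):
#   download_and_extract()
```

Deriving the name from the URL path keeps the saved file as `msgym_preprocessed.zip` regardless of the query string.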
app.py CHANGED
@@ -28,8 +28,8 @@ st.markdown("""
  This web app lets you test our trained model on your own data.
 
  ### 📚 References
- 🔗 **Paper:** [Read the publication here](https://github.com/HassounLab/MVP)
- 📦 **Source Code:** [GitHub Repository](https://github.com/HassounLab/MVP)
 
  ---
 
 
  This web app lets you test our trained model on your own data.
 
  ### 📚 References
+ 🔗 **Preprint:** [Learning from All Views: A Multiview Contrastive Framework for Metabolite Annotation](https://www.biorxiv.org/content/10.1101/2025.11.12.688047v1)
+ 📦 **Source Code:** [Hugging Face Repository](https://huggingface.co/spaces/HassounLab/MVP)
 
  ---
 