Add library_name, fix pipeline_tag
This PR improves the model card by ensuring:
- there is a proper `pipeline_tag`, so the model can be found at https://huggingface.co/models?pipeline_tag=reinforcement-learning
- the proper `library_name` is added, enabling the "how to use" button at the top right of the model page.
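For reference, a minimal sketch of reading this metadata back once the PR is merged, assuming the `LTL07/PSEC` repo id used in the README below and an installed `huggingface_hub`:

```python
# Minimal sketch: read the model card metadata with huggingface_hub.
# The repo id "LTL07/PSEC" is taken from the README links below; adjust if needed.
from huggingface_hub import ModelCard

card = ModelCard.load("LTL07/PSEC")
print(card.data.library_name)   # expected after this PR: diffusers
print(card.data.pipeline_tag)   # expected after this PR: reinforcement-learning
```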
README.md (CHANGED)

````diff
@@ -1,6 +1,9 @@
 ---
 license: mit
+library_name: diffusers
+pipeline_tag: reinforcement-learning
 ---
+
 <div align="center">
 <div style="margin-bottom: 30px"> <!-- reduce bottom spacing -->
 <div style="display: flex; flex-direction: column; align-items: center; gap: 8px"> <!-- new vertical layout container -->
@@ -10,15 +13,15 @@ license: mit
 </div>
 <h2 style="font-size: 32px; margin: 20px 0;">Skill Expansion and Composition in Parameter Space</h2>
 <h4 style="color: #666; margin-bottom: 25px;">International Conference on Learning Representation (ICLR), 2025</h4>
-<p align="center" style="margin:
+<p align="center" style="margin: 30px 0;">
 <a href="https://arxiv.org/abs/2502.05932">
 <img src="https://img.shields.io/badge/arXiv-2502.05932-b31b1b.svg">
 </a>
-
+
 <a href="https://ltlhuuu.github.io/PSEC/">
 <img src="https://img.shields.io/badge/🌐_Project_Page-PSEC-blue.svg">
 </a>
-
+
 <a href="https://arxiv.org/pdf/2502.05932.pdf">
 <img src="https://img.shields.io/badge/📑_Paper-PSEC-green.svg">
 </a>
@@ -31,10 +34,10 @@ license: mit
 🔥 Official Implementation
 </p>
 <p style="font-size: 18px; max-width: 800px; margin: 0 auto;">
-
+<img src="assets/icon.svg" width="20"> <b>PSEC</b> is a novel framework designed to:
 </p>
 </div>
-<div align="
+<div align="left">
 <p style="font-size: 15px; font-weight: 600; margin-bottom: 20px;">
 🚀 <b>Facilitate</b> efficient and flexible skill expansion and composition <br>
 🔄 <b>Iteratively evolve</b> the agents' capabilities<br>
@@ -99,18 +102,18 @@ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/bin
 ```
 ## Run experiments
 ### Pretrain
-Pretrain the model with the following command. Meanwhile there are pre-trained models, you can download them from [here](https://
+Pretrain the model with the following command. Meanwhile there are pre-trained models, you can download them from [here](https://huggingface.co/LTL07/PSEC).
 ```python
 export XLA_PYTHON_CLIENT_PREALLOCATE=False
 CUDA_VISIBLE_DEVICES=0 python launcher/examples/train_pretrain.py --variant 0 --seed 0
 ```
 ### LoRA finetune
-Train the skill policies with LoRA to achieve skill expansion. Meanwhile there are pre-trained models, you can download them from [here](https://
+Train the skill policies with LoRA to achieve skill expansion. Meanwhile there are pre-trained models, you can download them from [here](https://huggingface.co/LTL07/PSEC).
 ```python
 CUDA_VISIBLE_DEVICES=0 python launcher/examples/train_lora_finetune.py --com_method 0 --model_cls 'LoRALearner' --variant 0 --seed 0
 ```
 ### Context-aware Composition
-Train the context-aware modular to adaptively leverage different skill knowledge to solve the tasks. You can download the pretrained model and datasets from [here](https://
+Train the context-aware modular to adaptively leverage different skill knowledge to solve the tasks. You can download the pretrained model and datasets from [here](https://huggingface.co/LTL07/PSEC). Then, run the following command,
 ```python
 CUDA_VISIBLE_DEVICES=0 python launcher/examples/train_lora_finetune.py --com_method 0 --model_cls 'LoRASLearner' --variant 0 --seed 0
 ```
````
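The "Run experiments" sections in the diff above point to pre-trained models hosted at https://huggingface.co/LTL07/PSEC. A minimal sketch of fetching that snapshot with `huggingface_hub`; the repo's internal file layout is not described in this PR, so treat anything beyond the download call as an assumption:

```python
# Minimal sketch: download the pre-trained PSEC checkpoints referenced in the README.
# Inspect the downloaded snapshot before pointing the training scripts at it,
# since this PR does not document the repo's file layout.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="LTL07/PSEC")
print("Snapshot downloaded to:", local_dir)
```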