OpenDILabCommunity/LunarLander-v2-MuZero

@@ -21,7 +21,7 @@ model-index:
       type: LunarLander-v2
     metrics:
     - type: mean_reward
-      value: -41.76 +/- 128.15
       name: mean_reward
 ---
@@ -129,7 +129,7 @@ from huggingface_ding import push_model_to_hub
 # Instantiate the agent
 agent = MuZeroAgent(env_id="LunarLander-v2", exp_name="LunarLander-v2-MuZero")
 # Train the agent
-return_ = agent.train(step=int(10000))
 # Push model to huggingface hub
 push_model_to_hub(
     agent=agent.best,
@@ -149,7 +149,7 @@ pip3 install LightZero
     repo_id="OpenDILabCommunity/LunarLander-v2-MuZero",
     platform_info="[LightZero](https://github.com/opendilab/LightZero) and [DI-engine](https://github.com/opendilab/di-engine)",
     model_description="**LightZero** is an efficient, easy-to-understand open-source toolkit that merges Monte Carlo Tree Search (MCTS) with Deep Reinforcement Learning (RL), simplifying their integration for developers and researchers. More details are in paper [LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios](https://huggingface.co/papers/2310.08348).",
-    create_repo=True
 )
 ```
@@ -164,6 +164,7 @@ pip3 install LightZero
 exp_config = {
     'main_config': {
         'exp_name': 'LunarLander-v2-MuZero',
         'env': {
             'env_id': 'LunarLander-v2',
             'continuous': False,
@@ -199,6 +200,7 @@ exp_config = {
             'collector_env_num': 8,
             'evaluator_env_num': 3,
             'env_type': 'not_board_games',
             'battle_mode': 'play_with_bot_mode',
             'monitor_extra_statistics': True,
             'game_segment_length': 200,
@@ -294,13 +296,13 @@ exp_config = {
 - **Demo:** [video](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-MuZero/blob/main/replay.mp4)
 <!-- Provide the size information for the model. -->
 - **Parameters total size:** 15479.39 KB
-- **Last Update Date:** 2023-12-11
 ## Environments
 <!-- Address questions around what environment the model is intended to be trained and deployed at, including the necessary information needed to be provided for future users. -->
 - **Benchmark:** OpenAI/Gym/Box2d
 - **Task:** LunarLander-v2
 - **Gym version:** 0.25.1
-- **DI-engine version:** v0.4.9
-- **PyTorch version:** 2.1.1+cu121
 - **Doc**: [Environments link](<TODO>)

       type: LunarLander-v2
     metrics:
     - type: mean_reward
+      value: 206.55 +/- 102.39
       name: mean_reward
 ---
 # Instantiate the agent
 agent = MuZeroAgent(env_id="LunarLander-v2", exp_name="LunarLander-v2-MuZero")
 # Train the agent
+return_ = agent.train(step=int(5000000))
 # Push model to huggingface hub
 push_model_to_hub(
     agent=agent.best,
     repo_id="OpenDILabCommunity/LunarLander-v2-MuZero",
     platform_info="[LightZero](https://github.com/opendilab/LightZero) and [DI-engine](https://github.com/opendilab/di-engine)",
     model_description="**LightZero** is an efficient, easy-to-understand open-source toolkit that merges Monte Carlo Tree Search (MCTS) with Deep Reinforcement Learning (RL), simplifying their integration for developers and researchers. More details are in paper [LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios](https://huggingface.co/papers/2310.08348).",
+    create_repo=False
 )
 ```
 exp_config = {
     'main_config': {
         'exp_name': 'LunarLander-v2-MuZero',
+        'seed': 0,
         'env': {
             'env_id': 'LunarLander-v2',
             'continuous': False,
             'collector_env_num': 8,
             'evaluator_env_num': 3,
             'env_type': 'not_board_games',
+            'action_type': 'fixed_action_space',
             'battle_mode': 'play_with_bot_mode',
             'monitor_extra_statistics': True,
             'game_segment_length': 200,
 - **Demo:** [video](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-MuZero/blob/main/replay.mp4)
 <!-- Provide the size information for the model. -->
 - **Parameters total size:** 15479.39 KB
+- **Last Update Date:** 2023-12-21
 ## Environments
 <!-- Address questions around what environment the model is intended to be trained and deployed at, including the necessary information needed to be provided for future users. -->
 - **Benchmark:** OpenAI/Gym/Box2d
 - **Task:** LunarLander-v2
 - **Gym version:** 0.25.1
+- **DI-engine version:** v0.5.0
+- **PyTorch version:** 2.0.1+cu117
 - **Doc**: [Environments link](<TODO>)
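
To put the `mean_reward` change in this diff into numbers, here is a minimal sketch that parses the old and new metric strings and computes the difference between the reported means. The helper `parse_metric` is a hypothetical name introduced for illustration only; it is not part of LightZero, DI-engine, or `huggingface_ding`. The sketch assumes the metric always has the `mean +/- std` shape shown above.

```python
import re


def parse_metric(text):
    """Parse a 'mean +/- std' string into a (mean, std) pair of floats.

    Hypothetical helper for illustration; not a LightZero or DI-engine API.
    """
    match = re.fullmatch(
        r"\s*(-?\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)\s*", text
    )
    if match is None:
        raise ValueError(f"unrecognized metric format: {text!r}")
    return float(match.group(1)), float(match.group(2))


# Values copied from the removed and added lines of the model-index metadata.
old_mean, old_std = parse_metric("-41.76 +/- 128.15")
new_mean, new_std = parse_metric("206.55 +/- 102.39")

# Difference between the two reported means.
improvement = new_mean - old_mean
print(f"mean_reward changed by {improvement:.2f}")
```

The two standard deviations are large relative to the means, so the difference is best read as a rough indication of the change between the two reported evaluations, not a precise effect size.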