OpenDILabCommunity/PongNoFrameskip-v4-EfficientZero

Reinforcement Learning

deep-reinforcement-learning

PongNoFrameskip-v4


zjowowen committed on Jan 16, 2024

Commit 0f1f5bf · verified · 1 parent: 5bb9588

Upload README.md with huggingface_hub

Files changed (1)

README.md +3 -3


@@ -21,7 +21,7 @@ model-index:
       type: PongNoFrameskip-v4
     metrics:
     - type: mean_reward
-      value: 5.8 +/- 4.92
+      value: 20.4 +/- 0.66
       name: mean_reward
 ---
@@ -129,7 +129,7 @@ from huggingface_ding import push_model_to_hub
 # Instantiate the agent
 agent = EfficientZeroAgent(env_id="PongNoFrameskip-v4", exp_name="PongNoFrameskip-v4-EfficientZero")
 # Train the agent
-return_ = agent.train(step=int(500000))
+return_ = agent.train(step=int(2000000))
 # Push model to huggingface hub
 push_model_to_hub(
     agent=agent.best,
@@ -289,7 +289,7 @@ exp_config = {
 - **Demo:** [video](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-EfficientZero/blob/main/replay.mp4)
 <!-- Provide the size information for the model. -->
 - **Parameters total size:** 33023.14 KB
-- **Last Update Date:** 2024-01-08
+- **Last Update Date:** 2024-01-16
 ## Environments
 <!-- Address questions around what environment the model is intended to be trained and deployed at, including the necessary information needed to be provided for future users. -->
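Note on the `mean_reward` field: the diff does not say how a value such as `20.4 +/- 0.66` is produced. The sketch below shows one plausible way to build a string of that shape. It assumes the value is the mean and population standard deviation of per-episode returns. The helper name `format_mean_reward` and the list of returns are hypothetical, made up for illustration, and are not the actual evaluation data or the library's own code.

```python
import statistics


def format_mean_reward(episode_returns):
    """Format returns as 'mean +/- std' (assumed convention, not confirmed by the source)."""
    if not episode_returns:
        raise ValueError("need at least one episode return")
    mean = statistics.fmean(episode_returns)
    # Population standard deviation (divides by n); this choice is an assumption.
    std = statistics.pstdev(episode_returns)
    return f"{mean:.1f} +/- {std:.2f}"


# Hypothetical returns for ten evaluation episodes of Pong (scores lie in [-21, 21]).
example_returns = [21, 21, 21, 21, 21, 20, 20, 20, 20, 19]
print(format_mean_reward(example_returns))
```

If the sample standard deviation is wanted instead, `statistics.stdev` could be swapped in, which changes the second number slightly for small episode counts.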