---
title: Number Guessing Game Environment
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 8000
tags:
  - openenv
  - reinforcement-learning
  - game
  - binary-search
base_path: /web
---

# Number Guessing Game Environment

A simple OpenEnv environment where an agent learns to guess a secret number between 1 and 100 with limited attempts.

## Description

The agent receives hints after each guess ("higher", "lower", or "correct") and must find the secret number within 10 attempts. This environment is perfect for:
- Teaching RL agents binary search strategies
- Learning the OpenEnv framework
- Benchmarking simple reasoning capabilities

## Environment Details

**Action Space:**
- `GuessAction` with a single field:
  - `guess`: Integer between 1 and 100

**Observation Space:**
- `GuessObservation` with fields:
  - `hint`: String ("correct", "higher", "lower", or "invalid")
  - `attempts_remaining`: Number of guesses left
  - `guess_history`: List of all previous guesses
  - `done`: Boolean indicating if episode is complete
  - `reward`: Float reward for the action
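
As a rough illustration of the two types listed above, the following sketch models them as plain dataclasses. These are hypothetical stand-ins, not the package's actual definitions: the real classes may inherit from OpenEnv base types, and the default values shown here are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GuessAction:
    """Hypothetical stand-in for the environment's action type."""

    guess: int  # expected to lie in [1, 100] with the default settings


@dataclass
class GuessObservation:
    """Hypothetical stand-in for the environment's observation type."""

    hint: str  # "correct", "higher", "lower", or "invalid"
    attempts_remaining: int
    guess_history: List[int] = field(default_factory=list)
    done: bool = False  # assumed default
    reward: float = 0.0  # assumed default


# Example: what an observation might look like after guessing 50 too low
action = GuessAction(guess=50)
obs = GuessObservation(hint="higher", attempts_remaining=9, guess_history=[50])
print(action, obs)
```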

**Reward Structure:**
- `+10.0`: Correct guess (episode ends)
- `+0.1`: Valid guess that narrows the range
- `-1.0`: Invalid guess (out of bounds)
- `-5.0`: Failed to guess within max attempts

**Episode Termination:**
- Agent guesses correctly
- Agent runs out of attempts (10 by default)
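
The reward and termination rules above can be sketched as a single pure function. This is only an illustration, not the server's implementation: the function name `score_guess` is hypothetical, and it assumes that invalid guesses consume an attempt, that every valid incorrect guess earns `+0.1`, and that the `-5.0` penalty replaces the per-step reward on the final failed attempt.

```python
from typing import Tuple

# Defaults taken from the description above
MIN_NUMBER, MAX_NUMBER, MAX_ATTEMPTS = 1, 100, 10


def score_guess(secret: int, guess: int, attempts_used: int) -> Tuple[str, float, bool]:
    """Return (hint, reward, done) for one guess.

    `attempts_used` counts the guesses made before this one.
    """
    last_attempt = attempts_used + 1 >= MAX_ATTEMPTS

    if not MIN_NUMBER <= guess <= MAX_NUMBER:
        hint, reward = "invalid", -1.0
    elif guess == secret:
        return "correct", 10.0, True
    else:
        # "higher" means the secret is above the guess
        hint = "higher" if guess < secret else "lower"
        reward = 0.1

    if last_attempt:
        # Assumption: the failure penalty replaces the per-step reward
        return hint, -5.0, True
    return hint, reward, False


print(score_guess(secret=42, guess=50, attempts_used=0))
print(score_guess(secret=42, guess=42, attempts_used=3))
```

Under these assumptions, an episode solved on guess `k` returns `10.0 + 0.1 * (k - 1)` in total.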

## Quick Start

### Using the Client

```python
from envs.number_guess_env import NumberGuessEnv, GuessAction

# Connect to a running server
client = NumberGuessEnv(base_url="http://localhost:8000")

# Or use Docker (automatically starts container)
# client = NumberGuessEnv.from_docker_image("number-guess-env:latest")

# Start a new game
result = client.reset()
print(result.observation.hint)  # "Guess a number between 1 and 100!"

# Make guesses
result = client.step(GuessAction(guess=50))
print(f"Hint: {result.observation.hint}")
print(f"Reward: {result.reward}")
print(f"Attempts left: {result.observation.attempts_remaining}")

# Continue until done
while not result.done:
    # Your agent logic here
    guess = 75  # Example
    result = client.step(GuessAction(guess=guess))
    print(f"Hint: {result.observation.hint}")

client.close()
```

### Training an Agent

```python
from envs.number_guess_env import NumberGuessEnv, GuessAction

env = NumberGuessEnv.from_docker_image("number-guess-env:latest")

for episode in range(100):
    result = env.reset()
    total_reward = 0

    # Simple binary search strategy
    low, high = 1, 100

    while not result.done:
        guess = (low + high) // 2
        result = env.step(GuessAction(guess=guess))
        total_reward += result.reward

        if result.observation.hint == "higher":
            low = guess + 1
        elif result.observation.hint == "lower":
            high = guess - 1

    print(f"Episode {episode}: Total reward = {total_reward}")

env.close()
```

## Building and Running

### Build Docker Image

```bash
docker build -t number-guess-env:latest server/
```

### Run Server Locally

```bash
# Using uvicorn directly
cd server
uvicorn app:app --host 0.0.0.0 --port 8000

# Or using Docker
docker run -p 8000:8000 number-guess-env:latest
```

### Test the Server

```bash
# Reset
curl -X POST http://localhost:8000/reset

# Step
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"guess": 50}'

# Get state
curl http://localhost:8000/state
```
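
The same three calls can be made from Python with only the standard library. The sketch below mirrors the `curl` commands above; the helper names `build_request` and `call` are hypothetical, and it assumes the server answers with JSON bodies. Nothing is sent until `call` is invoked, so the request objects can be inspected without a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same address as in the curl examples


def build_request(path, payload=None, method="POST"):
    """Build a request for `path`, JSON-encoding `payload` when one is given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(request):
    """Send the request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


reset_request = build_request("/reset")
step_request = build_request("/step", {"guess": 50})
state_request = build_request("/state", method="GET")

# With the server running, send them in order:
# for request in (reset_request, step_request, state_request):
#     print(call(request))
```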

## Environment Customization

You can customize the environment parameters:

```python
from envs.number_guess_env.server.number_guess_environment import NumberGuessEnvironment

# Custom range and attempts
env = NumberGuessEnvironment(
    max_attempts=15,
    min_number=1,
    max_number=1000
)
```
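
When picking a custom range, it can help to check how many guesses a midpoint binary search needs in the worst case, since that determines whether `max_attempts` leaves the task solvable. The helper names below are hypothetical and not part of the package; the sketch computes the bound in closed form and cross-checks it by simulation.

```python
def worst_case_guesses(min_number: int, max_number: int) -> int:
    """Worst-case guess count for midpoint binary search over an inclusive range."""
    size = max_number - min_number + 1
    if size < 1:
        raise ValueError("max_number must be >= min_number")
    # For size >= 1, bit_length() equals floor(log2(size)) + 1
    return size.bit_length()


def simulate(secret: int, min_number: int, max_number: int) -> int:
    """Count the guesses binary search takes to find `secret`."""
    low, high, guesses = min_number, max_number, 0
    while True:
        guess = (low + high) // 2
        guesses += 1
        if guess == secret:
            return guesses
        if guess < secret:
            low = guess + 1
        else:
            high = guess - 1


print(worst_case_guesses(1, 100))   # 7
print(worst_case_guesses(1, 1000))  # 10
print(max(simulate(s, 1, 100) for s in range(1, 101)))  # 7
```

So the default 10 attempts comfortably cover the range 1 to 100, and 15 attempts cover 1 to 1000; setting `max_attempts` below the worst-case count makes some secrets unreachable even for an optimal agent.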

## API Endpoints

When running as a server, the following endpoints are available:

- `POST /reset` - Start a new game with a new secret number
- `POST /step` - Submit a guess and receive a hint
- `GET /state` - Get current episode state (episode_id, step_count)
- `GET /health` - Health check endpoint
- `GET /` - API documentation

## License

BSD 3-Clause License (see LICENSE file)