---
title: Number Guessing Game Environment
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 8000
tags:
  - openenv
  - reinforcement-learning
  - game
  - binary-search
base_path: /web
---

# Number Guessing Game Environment

A simple OpenEnv environment where an agent learns to guess a secret number between 1 and 100 with limited attempts.

## Description

The agent receives a hint after each guess ("higher", "lower", or "correct") and must find the secret number within 10 attempts. This environment is well suited to:
- Teaching RL agents binary search strategies
- Learning the OpenEnv framework
- Benchmarking simple reasoning capabilities

## Environment Details

**Action Space:**
- `GuessAction` with a single field:
  - `guess`: Integer between 1 and 100

**Observation Space:**
- `GuessObservation` with fields:
  - `hint`: String ("correct", "higher", "lower", or "invalid"); right after `reset()` it holds an initial prompt message instead
  - `attempts_remaining`: Number of guesses left
  - `guess_history`: List of all previous guesses
  - `done`: Boolean indicating if episode is complete
  - `reward`: Float reward for the action

**Reward Structure:**
- `+10.0`: Correct guess (episode ends)
- `+0.1`: Valid guess that narrows the range
- `-1.0`: Invalid guess (out of bounds)
- `-5.0`: Failed to guess within max attempts

**Episode Termination:**
- Agent guesses correctly
- Agent runs out of attempts (10 by default)
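
The reward and termination rules above can be sketched as a single scoring function. This is a minimal illustration, not the environment's actual implementation: `score_guess` is a hypothetical name, and two behaviours the rules leave open are assumed here (an invalid guess still uses up an attempt, and the `-5.0` failure penalty replaces the reward of the final unsuccessful guess).

```python
from typing import Tuple


def score_guess(
    guess: int,
    secret: int,
    attempts_remaining: int,
    min_number: int = 1,
    max_number: int = 100,
) -> Tuple[str, float, bool, int]:
    """Return (hint, reward, done, attempts_remaining) for one guess."""
    # Assumption: every guess, valid or not, consumes one attempt.
    attempts_remaining -= 1

    if not (min_number <= guess <= max_number):
        hint, reward = "invalid", -1.0
    elif guess == secret:
        return "correct", 10.0, True, attempts_remaining
    else:
        # "higher" means the secret is above the guess.
        hint = "higher" if guess < secret else "lower"
        reward = 0.1

    if attempts_remaining <= 0:
        # Assumption: the failure penalty replaces the per-guess reward.
        return hint, -5.0, True, 0
    return hint, reward, False, attempts_remaining
```

For example, `score_guess(50, 75, 10)` gives `("higher", 0.1, False, 9)`, while the same guess with one attempt left gives `("higher", -5.0, True, 0)`.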

## Quick Start

### Using the Client

```python
from envs.number_guess_env import NumberGuessEnv, GuessAction

# Connect to a running server
client = NumberGuessEnv(base_url="http://localhost:8000")

# Or use Docker (automatically starts container)
# client = NumberGuessEnv.from_docker_image("number-guess-env:latest")

# Start a new game
result = client.reset()
print(result.observation.hint)  # "Guess a number between 1 and 100!"

# Make guesses
result = client.step(GuessAction(guess=50))
print(f"Hint: {result.observation.hint}")
print(f"Reward: {result.reward}")
print(f"Attempts left: {result.observation.attempts_remaining}")

# Continue until done
while not result.done:
    # Your agent logic here
    guess = 75  # Placeholder: a fixed guess repeats until it is correct or attempts run out
    result = client.step(GuessAction(guess=guess))
    print(f"Hint: {result.observation.hint}")

client.close()
```

### Training an Agent

```python
from envs.number_guess_env import NumberGuessEnv, GuessAction

env = NumberGuessEnv.from_docker_image("number-guess-env:latest")

for episode in range(100):
    result = env.reset()
    total_reward = 0

    # Simple binary search strategy
    low, high = 1, 100

    while not result.done:
        guess = (low + high) // 2
        result = env.step(GuessAction(guess=guess))
        total_reward += result.reward

        if result.observation.hint == "higher":
            low = guess + 1
        elif result.observation.hint == "lower":
            high = guess - 1

    print(f"Episode {episode}: Total reward = {total_reward}")

env.close()
```

## Building and Running

### Build Docker Image

```bash
docker build -t number-guess-env:latest server/
```

### Run Server Locally

```bash
# Using uvicorn directly
cd server
uvicorn app:app --host 0.0.0.0 --port 8000

# Or using Docker
docker run -p 8000:8000 number-guess-env:latest
```

### Test the Server

```bash
# Reset
curl -X POST http://localhost:8000/reset

# Step
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"guess": 50}'

# Get state
curl http://localhost:8000/state
```
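
The same calls can be made from Python with only the standard library. The sketch below mirrors the curl commands above: the URLs and the `{"guess": 50}` payload are copied from them, while the helper names (`build_reset_request`, `build_step_request`, `build_state_request`, `send`) are hypothetical and the responses are assumed to be JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same address as the curl examples


def build_reset_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /reset with no body, like `curl -X POST .../reset`."""
    return urllib.request.Request(f"{base_url}/reset", method="POST")


def build_step_request(guess: int, base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /step with a JSON body such as {"guess": 50}."""
    body = json.dumps({"guess": guess}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_state_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """GET /state."""
    return urllib.request.Request(f"{base_url}/state", method="GET")


def send(req: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send a request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `send(build_reset_request())` followed by `send(build_step_request(50))` should correspond to the first two curl commands.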

## Environment Customization

You can customize the environment parameters:

```python
from envs.number_guess_env.server.number_guess_environment import NumberGuessEnvironment

# Custom range and attempts
env = NumberGuessEnvironment(
    max_attempts=15,
    min_number=1,
    max_number=1000
)
```
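
When changing the range, it helps to check that `max_attempts` still leaves room for a binary search to finish. The sketch below (helper names are hypothetical) computes the worst-case number of guesses for a range and confirms it by brute-force simulation, assuming the same midpoint strategy as the binary search example earlier in this README.

```python
def worst_case_guesses(min_number: int, max_number: int) -> int:
    """Worst-case guesses for binary search over an inclusive range."""
    size = max_number - min_number + 1
    # floor(log2(size)) + 1, computed exactly with integer arithmetic.
    return size.bit_length()


def simulate_worst_case(min_number: int, max_number: int) -> int:
    """Run binary search against every possible secret; return the max guess count."""
    worst = 0
    for secret in range(min_number, max_number + 1):
        low, high, count = min_number, max_number, 0
        while True:
            guess = (low + high) // 2
            count += 1
            if guess == secret:
                break
            if guess < secret:
                low = guess + 1
            else:
                high = guess - 1
        worst = max(worst, count)
    return worst


print(worst_case_guesses(1, 100))   # 7
print(worst_case_guesses(1, 1000))  # 10
```

So the default 10 attempts comfortably cover the range 1 to 100, and the customized `max_attempts=15` above covers 1 to 1000 with five guesses to spare.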

## API Endpoints

When running as a server, the following endpoints are available:

- `POST /reset` - Start a new game with a new secret number
- `POST /step` - Submit a guess and receive a hint
- `GET /state` - Get current episode state (episode_id, step_count)
- `GET /health` - Health check endpoint
- `GET /` - API documentation

## License

BSD 3-Clause License (see LICENSE file)