---
title: Number Guessing Game Environment
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 8000
tags:
  - openenv
  - reinforcement-learning
  - game
  - binary-search
base_path: /web
---

# Number Guessing Game Environment

A simple OpenEnv environment in which an agent must guess a secret number between 1 and 100 within a limited number of attempts.

## Description

After each guess the agent receives a hint ("higher", "lower", or "correct"; an out-of-range guess returns "invalid") and must find the secret number within 10 attempts by default. The environment is suited to:
- Teaching RL agents binary search strategies
- Learning the OpenEnv framework
- Benchmarking simple reasoning capabilities

## Environment Details

**Action Space:**
- `GuessAction` with a single field:
  - `guess`: Integer between 1 and 100

**Observation Space:**
- `GuessObservation` with fields:
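  - `hint`: String ("correct", "higher", "lower", or "invalid"); "higher" means the secret number is above the guess, and after `reset()` the field holds a prompt message instead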
  - `attempts_remaining`: Number of guesses left
  - `guess_history`: List of all previous guesses
  - `done`: Boolean indicating if episode is complete
  - `reward`: Float reward for the action
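
The two message types above can be pictured as plain dataclasses. This is only an illustrative sketch: the names `GuessActionSketch` and `GuessObservationSketch` are hypothetical, and the real `GuessAction` and `GuessObservation` classes in the package may inherit from OpenEnv base types and carry extra fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GuessActionSketch:
    """Hypothetical stand-in for GuessAction."""
    guess: int  # intended range: 1 to 100


@dataclass
class GuessObservationSketch:
    """Hypothetical stand-in for GuessObservation."""
    hint: str                      # "correct", "higher", "lower", or "invalid"
    attempts_remaining: int        # guesses left in the episode
    guess_history: List[int] = field(default_factory=list)
    done: bool = False             # True once the episode is over
    reward: float = 0.0            # reward for the most recent action


action = GuessActionSketch(guess=50)
obs = GuessObservationSketch(hint="higher", attempts_remaining=9, guess_history=[50])
```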

**Reward Structure:**
- `+10.0`: Correct guess (episode ends)
- `+0.1`: Valid guess that narrows the range
- `-1.0`: Invalid guess (out of bounds)
- `-5.0`: Failed to guess within max attempts

**Episode Termination:**
- Agent guesses correctly
- Agent runs out of attempts (10 by default)
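
The reward and termination rules above can be sketched as a single scoring function. The function name `score_guess` is hypothetical and not part of the package, and two behaviors are assumptions rather than documented facts: that an invalid guess also uses up an attempt, and that the `-5.0` failure penalty replaces the per-step reward on the final attempt.

```python
from typing import Tuple


def score_guess(
    guess: int,
    secret: int,
    attempts_remaining: int,
    low: int = 1,
    high: int = 100,
) -> Tuple[str, float, bool, int]:
    """Return (hint, reward, done, attempts_remaining) for one guess."""
    # Assumption: every guess, valid or not, consumes one attempt.
    attempts_remaining -= 1

    if guess == secret:
        return "correct", 10.0, True, attempts_remaining

    if not (low <= guess <= high):
        hint, reward = "invalid", -1.0
    else:
        # "higher" means the secret is above the guess.
        hint = "higher" if secret > guess else "lower"
        reward = 0.1

    if attempts_remaining == 0:
        # Assumption: the failure penalty replaces the per-step reward.
        return hint, -5.0, True, attempts_remaining

    return hint, reward, False, attempts_remaining
```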

## Quick Start

### Using the Client

```python
from envs.number_guess_env import NumberGuessEnv, GuessAction

# Connect to a running server
client = NumberGuessEnv(base_url="http://localhost:8000")

# Or use Docker (automatically starts container)
# client = NumberGuessEnv.from_docker_image("number-guess-env:latest")

# Start a new game
result = client.reset()
print(result.observation.hint)  # "Guess a number between 1 and 100!"

# Make guesses
result = client.step(GuessAction(guess=50))
print(f"Hint: {result.observation.hint}")
print(f"Reward: {result.reward}")
print(f"Attempts left: {result.observation.attempts_remaining}")

# Continue until done
while not result.done:
    # Placeholder: replace the constant with your agent's logic.
    # A fixed guess usually just uses up the remaining attempts.
    guess = 75
    result = client.step(GuessAction(guess=guess))
    print(f"Hint: {result.observation.hint}")

client.close()
```

### Training an Agent

```python
from envs.number_guess_env import NumberGuessEnv, GuessAction

env = NumberGuessEnv.from_docker_image("number-guess-env:latest")

for episode in range(100):
    result = env.reset()
    total_reward = 0

    # Scripted binary search baseline (no learning): track the feasible range
    low, high = 1, 100

    while not result.done:
        guess = (low + high) // 2
        result = env.step(GuessAction(guess=guess))
        total_reward += result.reward

        if result.observation.hint == "higher":
            low = guess + 1
        elif result.observation.hint == "lower":
            high = guess - 1

    print(f"Episode {episode}: Total reward = {total_reward}")

env.close()
```
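
The bound tracking in the loop above can also be factored into a small policy object, which makes it possible to check the strategy offline before connecting to a server. This is a minimal sketch: `BinarySearchAgent` and `play_offline` are hypothetical helpers, and the hints are simulated locally under the assumption that "higher" means the secret is above the guess.

```python
from typing import Optional


class BinarySearchAgent:
    """Keeps the feasible range [low, high] and always guesses its midpoint."""

    def __init__(self, low: int = 1, high: int = 100) -> None:
        self.low = low
        self.high = high

    def next_guess(self) -> int:
        return (self.low + self.high) // 2

    def update(self, guess: int, hint: str) -> None:
        # Shrink the range according to the hint; other hints leave it unchanged.
        if hint == "higher":
            self.low = guess + 1
        elif hint == "lower":
            self.high = guess - 1


def play_offline(secret: int, max_attempts: int = 10) -> Optional[int]:
    """Return the number of guesses used, or None if attempts ran out."""
    agent = BinarySearchAgent()
    for attempt in range(1, max_attempts + 1):
        guess = agent.next_guess()
        if guess == secret:
            return attempt
        agent.update(guess, "higher" if secret > guess else "lower")
    return None
```

Against the real environment, `agent.update(guess, result.observation.hint)` would take the place of the simulated hint.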

## Building and Running

### Build Docker Image

```bash
docker build -t number-guess-env:latest server/
```

### Run Server Locally

```bash
# Using uvicorn directly
cd server
uvicorn app:app --host 0.0.0.0 --port 8000

# Or using Docker
docker run -p 8000:8000 number-guess-env:latest
```

### Test the Server

```bash
# Reset
curl -X POST http://localhost:8000/reset

# Step
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"guess": 50}'

# Get state
curl http://localhost:8000/state
```
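
The same calls can be made from Python with only the standard library. The sketch below mirrors the curl commands above, including the `{"guess": 50}` body; the helper names are hypothetical, and it assumes the server replies with JSON. Building a request does not contact the server; only `send` does.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same address as the curl examples


def build_reset_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /reset with no body."""
    return urllib.request.Request(f"{base_url}/reset", method="POST")


def build_step_request(guess: int, base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /step with the same JSON body as the curl example."""
    body = json.dumps({"guess": guess}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request to a running server and decode the reply (assumed JSON)."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send(build_reset_request())` followed by `send(build_step_request(50))` would reproduce the first two curl calls.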

## Environment Customization

The environment constructor accepts a custom number range and attempt limit:

```python
from envs.number_guess_env.server.number_guess_environment import NumberGuessEnvironment

# Custom range and attempts
env = NumberGuessEnvironment(
    max_attempts=15,
    min_number=1,
    max_number=1000
)
```
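
When changing the range, it helps to check that `max_attempts` still leaves room for a binary search to finish. With "higher"/"lower"/"correct" feedback, a midpoint search over `n` candidates needs at most `floor(log2(n)) + 1` guesses, which is what the sketch below computes. The helper names are hypothetical and not part of the package.

```python
def worst_case_guesses(min_number: int, max_number: int) -> int:
    """Maximum guesses a midpoint binary search needs over the given range."""
    n = max_number - min_number + 1
    if n < 1:
        raise ValueError("max_number must be >= min_number")
    # For n >= 1, n.bit_length() equals floor(log2(n)) + 1.
    return n.bit_length()


def attempts_suffice(max_attempts: int, min_number: int, max_number: int) -> bool:
    """True if binary search is guaranteed to finish within max_attempts."""
    return max_attempts >= worst_case_guesses(min_number, max_number)


print(worst_case_guesses(1, 100))     # default range
print(worst_case_guesses(1, 1000))    # custom range from the example above
print(attempts_suffice(15, 1, 1000))
```

For the default range of 1 to 100 the bound is 7 guesses, so the default limit of 10 attempts leaves some slack; the custom range of 1 to 1000 needs up to 10, which fits within 15 attempts.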

## API Endpoints

When running as a server, the following endpoints are available:

- `POST /reset` - Start a new game with a new secret number
- `POST /step` - Submit a guess and receive a hint
- `GET /state` - Get current episode state (episode_id, step_count)
- `GET /health` - Health check endpoint
- `GET /` - API documentation

## License

BSD 3-Clause License (see LICENSE file)