# Cloud Agents for Distributed Model Training

Cloud Agents is a lightweight, horizontally scalable distributed computing system for training large language models, designed specifically for OpenPeerLLM.

## Features

- Distributed tensor operations for model training
- CouchDB-based coordination layer
- Automatic agent discovery and load balancing
- Horizontal scaling capabilities
- Fault tolerance and recovery
- Integration with OpenPeerAI's OpenPeerLLM
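
This README does not describe how agent discovery, load balancing, or recovery are implemented. One common pattern is heartbeat-based: agents report in periodically, new jobs go to the least-loaded live agent, and jobs held by silent agents are reassigned. The sketch below illustrates that pattern with an in-memory registry. `Registry`, `AgentRecord`, and the 30-second timeout are hypothetical names and values chosen for illustration, not part of the `cloud_agents` API.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    agent_id: str
    last_heartbeat: float                      # seconds, e.g. from time.time()
    jobs: list = field(default_factory=list)   # job ids currently assigned


class Registry:
    """In-memory stand-in for the coordination layer (hypothetical)."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout
        self.agents = {}

    def heartbeat(self, agent_id: str, now: float) -> None:
        # The first heartbeat registers the agent; later ones refresh it.
        record = self.agents.setdefault(agent_id, AgentRecord(agent_id, now))
        record.last_heartbeat = now

    def alive(self, now: float) -> list:
        return [a for a in self.agents.values()
                if now - a.last_heartbeat <= self.timeout]

    def assign(self, job: str, now: float) -> str:
        # Pick the least-loaded live agent; ties are broken by agent id.
        candidates = self.alive(now)
        if not candidates:
            raise RuntimeError("no live agents")
        target = min(candidates, key=lambda a: (len(a.jobs), a.agent_id))
        target.jobs.append(job)
        return target.agent_id

    def recover(self, now: float) -> list:
        # Remove agents that missed their heartbeat window and
        # reassign the jobs they were holding.
        dead = [a for a in self.agents.values()
                if now - a.last_heartbeat > self.timeout]
        moved = []
        for agent in dead:
            del self.agents[agent.agent_id]
            for job in agent.jobs:
                moved.append((job, self.assign(job, now)))
        return moved


registry = Registry(timeout=30.0)
registry.heartbeat("agent-a", now=0.0)
registry.heartbeat("agent-b", now=0.0)
first = registry.assign("shard-0", now=1.0)
second = registry.assign("shard-1", now=1.0)
registry.heartbeat("agent-b", now=40.0)   # agent-a stays silent
moved = registry.recover(now=45.0)
print(first, second, moved)
```

In a real deployment the registry state would presumably live in CouchDB rather than in process memory, so that it survives a coordinator restart.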

## Installation

```bash
pip install -r requirements.txt
```

## Configuration

1. Set up a CouchDB instance.
2. Copy `.env.example` to `.env` and configure your settings.
3. Start the coordinator node.
4. Launch the agent nodes.
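
The settings in step 2 are not listed here, so consult `.env.example` for the real keys. As a rough illustration of the file format, the sketch below parses `.env`-style text into a dictionary using only the standard library. The keys `COUCHDB_URL` and `COUCHDB_DATABASE` are hypothetical placeholders, and `parse_env` is an illustrative helper rather than a function provided by this project.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one layer of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


# Hypothetical keys, for illustration only.
example = """
# CouchDB connection
COUCHDB_URL=http://localhost:5984
COUCHDB_DATABASE="cloud_agents"
"""
settings = parse_env(example)
print(settings)
```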

## Quick Start

```bash
# Start coordinator
python -m cloud_agents.coordinator

# Start agent (on each machine)
python -m cloud_agents.agent
```

## Architecture

- `coordinator`: Manages job distribution and agent coordination
- `agent`: Handles tensor operations and model training
- `couchdb_client`: Interface for CouchDB communication
- `tensor_ops`: Distributed tensor operations
- `utils`: Helper functions and utilities
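
To make the split between `coordinator`, `agent`, and `tensor_ops` more concrete, here is a minimal sketch of one way a tensor operation could be distributed: the coordinator shards a matrix by rows, each agent multiplies its shard, and the partial results are concatenated. This is an assumed scheme for illustration, written in plain Python and run sequentially. The function names are hypothetical and do not reflect the actual `tensor_ops` interface.

```python
def split_rows(matrix, num_parts):
    """Split a matrix (list of rows) into num_parts contiguous chunks."""
    base, extra = divmod(len(matrix), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks take one additional row each.
        size = base + (1 if i < extra else 0)
        chunks.append(matrix[start:start + size])
        start += size
    return chunks


def matmul(a, b):
    """Plain-Python matrix product; stands in for an agent's local work."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


def distributed_matmul(a, b, num_agents):
    # Coordinator side: shard A by rows; every agent receives all of B.
    shards = split_rows(a, num_agents)
    # Agent side: each shard is multiplied independently
    # (sequentially here, on separate machines in a real system).
    partials = [matmul(shard, b) for shard in shards]
    # Coordinator side: concatenate partial results in shard order.
    return [row for part in partials for row in part]


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8], [9, 10]]
result = distributed_matmul(a, b, num_agents=2)
print(result)
```

Row sharding keeps the agents independent of one another, since each output row depends only on one row of the left operand, which is why no communication between agents is needed until the final gather.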

## License

MIT