# Cloud Agents for Distributed Model Training

A lightweight and horizontally scalable distributed computing system for training large language models, specifically designed for OpenPeerLLM.

## Features

- Distributed tensor operations for model training
- CouchDB-based coordination layer
- Automatic agent discovery and load balancing
- Horizontal scaling capabilities
- Fault tolerance and recovery
- Integration with OpenPeerAI's OpenPeerLLM

## Installation

```bash
pip install -r requirements.txt
```

## Configuration

1. Set up CouchDB instance
2. Copy `.env.example` to `.env` and configure your settings
3. Start the coordinator node
4. Launch agent nodes

## Quick Start

```bash
# Start coordinator
python -m cloud_agents.coordinator

# Start agent (on each machine)
python -m cloud_agents.agent
```

## Architecture

- `coordinator`: Manages job distribution and agent coordination
- `agent`: Handles tensor operations and model training
- `couchdb_client`: Interface for CouchDB communication
- `tensor_ops`: Distributed tensor operations
- `utils`: Helper functions and utilities

## License

MIT
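
## Example: Job Distribution Sketch

The `coordinator` is described above as managing job distribution across agents with load balancing. As a rough illustration of that idea only, the sketch below keeps an in-memory registry and assigns each job to the least-loaded agent. It is not taken from the `cloud_agents` package: the names `AgentInfo`, `Coordinator`, `register`, `assign`, and `complete` are hypothetical, and the real coordinator exchanges state through CouchDB rather than a local dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class AgentInfo:
    """Hypothetical record a coordinator might keep per registered agent."""
    agent_id: str
    active_jobs: int = 0


@dataclass
class Coordinator:
    """Toy in-memory coordinator (assumption: stands in for CouchDB state)."""
    agents: dict[str, AgentInfo] = field(default_factory=dict)
    assignments: dict[str, str] = field(default_factory=dict)  # job_id -> agent_id

    def register(self, agent_id: str) -> None:
        # Discovery stand-in: an agent announces itself once on startup.
        self.agents.setdefault(agent_id, AgentInfo(agent_id))

    def assign(self, job_id: str) -> str:
        # Least-loaded selection; ties are broken by agent_id for determinism.
        if not self.agents:
            raise RuntimeError("no agents registered")
        target = min(
            self.agents.values(),
            key=lambda a: (a.active_jobs, a.agent_id),
        )
        target.active_jobs += 1
        self.assignments[job_id] = target.agent_id
        return target.agent_id

    def complete(self, job_id: str) -> None:
        # Free the agent's slot when it reports the job as finished.
        agent_id = self.assignments.pop(job_id)
        self.agents[agent_id].active_jobs -= 1
```

A caller would register agents as they come online, call `assign` for each new job, and call `complete` when an agent reports back. Least-loaded selection is one simple policy among several; the project may well use a different one.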