Cloud Agents for Distributed Model Training

A lightweight, horizontally scalable distributed computing system for training large language models, designed specifically for OpenPeerLLM.

Features

  • Distributed tensor operations for model training
  • CouchDB-based coordination layer
  • Automatic agent discovery and load balancing
  • Horizontal scaling capabilities
  • Fault tolerance and recovery
  • Integration with OpenPeerAI's OpenPeerLLM
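To make "distributed tensor operations" concrete, here is a minimal sketch of one common approach: splitting a matrix into row shards, computing each shard's product independently, and concatenating the partial results. It is an illustration only and not taken from the project's code. The function names (`split_rows`, `distributed_matmul`) are hypothetical, and local threads stand in for remote agents.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(matrix, num_shards):
    """Split a matrix (list of rows) into at most num_shards contiguous shards."""
    shard_size = max(1, -(-len(matrix) // num_shards))  # ceiling division
    return [matrix[i:i + shard_size] for i in range(0, len(matrix), shard_size)]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def distributed_matmul(a, b, num_agents):
    """Each hypothetical agent multiplies one row shard of `a` by the full `b`."""
    shards = split_rows(a, num_agents)
    # Threads stand in for remote agents (assumption for illustration).
    with ThreadPoolExecutor(max_workers=num_agents) as pool:
        # map() preserves shard order, so partial results concatenate correctly.
        partials = list(pool.map(lambda shard: matmul(shard, b), shards))
    return [row for partial in partials for row in partial]
```

Row sharding works here because each output row depends on only one row of `a`, so shards need no communication with each other until the final concatenation.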

Installation

pip install -r requirements.txt

Configuration

  1. Set up a CouchDB instance
  2. Copy .env.example to .env and configure your settings
  3. Start the coordinator node
  4. Launch agent nodes
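As a rough illustration of step 2, the sketch below reads `KEY=VALUE` settings from a `.env` file using only the standard library. The actual keys expected by the project are defined in `.env.example`. The names used in the usage note (`COUCHDB_URL`, `COUCHDB_DATABASE`) are assumptions for illustration, as are the helper functions `parse_env` and `load_settings`.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def load_settings(path=".env"):
    """Read settings from a .env file; real environment variables take precedence."""
    with open(path, encoding="utf-8") as handle:
        settings = parse_env(handle.read())
    for key in settings:
        if key in os.environ:
            settings[key] = os.environ[key]
    return settings
```

For example, a file containing `COUCHDB_URL=http://localhost:5984` (5984 is CouchDB's default port) would yield `{"COUCHDB_URL": "http://localhost:5984"}`, unless that variable is already set in the environment.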

Quick Start

# Start coordinator
python -m cloud_agents.coordinator

# Start agent (on each machine)
python -m cloud_agents.agent

Architecture

  • coordinator: Manages job distribution and agent coordination
  • agent: Handles tensor operations and model training
  • couchdb_client: Interface for CouchDB communication
  • tensor_ops: Distributed tensor operations
  • utils: Helper functions and utilities
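One way the coordinator and agents could cooperate through a shared document store is sketched below. This is an assumption about the pattern, not the project's actual implementation. The coordinator publishes job documents, and each agent claims a pending job with a revision-checked update, similar in spirit to how CouchDB rejects a write carrying a stale `_rev`. The class `InMemoryStore` and the functions `submit_job` and `claim_next_job` are hypothetical, and an in-memory dictionary stands in for CouchDB.

```python
class ConflictError(Exception):
    """Raised when a document is updated with a stale revision."""


class InMemoryStore:
    """Stand-in for a document database with revision-checked updates."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current["rev"] if current else None
        if rev != current_rev:
            raise ConflictError(doc_id)
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = {"rev": new_rev, "doc": dict(doc)}
        return new_rev

    def get(self, doc_id):
        entry = self._docs[doc_id]
        return dict(entry["doc"]), entry["rev"]

    def ids(self):
        return sorted(self._docs)


def submit_job(store, job_id, payload):
    """Coordinator side: publish a pending job."""
    store.put(job_id, {"status": "pending", "payload": payload, "agent": None})


def claim_next_job(store, agent_name):
    """Agent side: claim the first pending job, or return None if none remain."""
    for job_id in store.ids():
        doc, rev = store.get(job_id)
        if doc["status"] != "pending":
            continue
        doc.update(status="running", agent=agent_name)
        try:
            store.put(job_id, doc, rev=rev)
        except ConflictError:
            continue  # another agent claimed this job first
        return job_id
    return None
```

The revision check means two agents racing for the same job cannot both succeed: the slower write fails with a conflict, and that agent moves on to the next pending job.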

License

MIT