Spaces:
Paused
Paused
| # Docker Container Fixes Summary | |
| ## Issues Identified | |
| 1. **Database Connection Error**: `sqlite3.OperationalError: unable to open database file` | |
| 2. **OCR Model Loading Error**: Incompatible model `microsoft/trocr-base-handwritten` | |
| 3. **Container Startup Failure**: Database initialization during module import | |
| ## Fixes Applied | |
| ### 1. Database Service Improvements | |
| **File**: `app/services/database_service.py` | |
| **Changes**: | |
| - Removed automatic database initialization during import | |
| - Added explicit `initialize()` method that must be called | |
| - Improved directory creation with proper permissions (777) | |
| - Added fallback to current directory if `/app/data` fails | |
| - Added environment variable support for database path | |
| **Key Changes**: | |
| ```python | |
| def __init__(self, db_path: str = None): | |
| # Use environment variable or default path | |
| if db_path is None: | |
| db_path = os.getenv('DATABASE_PATH', '/app/data/legal_dashboard.db') | |
| self.db_path = db_path | |
| self.connection = None | |
| # Ensure data directory exists with proper permissions | |
| self._ensure_data_directory() | |
| # Don't initialize immediately - let it be called explicitly | |
| logger.info(f"Database manager initialized with path: {self.db_path}") | |
| ``` | |
| ### 2. OCR Service Improvements | |
| **File**: `app/services/ocr_service.py` | |
| **Changes**: | |
| - Added multiple compatible model fallbacks | |
| - Improved error handling for model loading | |
| - Added graceful degradation to basic text extraction | |
| - Removed problematic model `microsoft/trocr-base-handwritten` | |
| **Compatible Models**: | |
| 1. `microsoft/trocr-base-stage1` | |
| 2. `microsoft/trocr-base-handwritten` | |
| 3. `microsoft/trocr-small-stage1` | |
| 4. `microsoft/trocr-small-handwritten` | |
| ### 3. Docker Configuration Improvements | |
| **File**: `Dockerfile` | |
| **Changes**: | |
| - Added `curl` for health checks | |
| - Added environment variable for database path | |
| - Added startup script for proper initialization | |
| - Ensured proper permissions on data directory | |
| **Key Additions**: | |
| ```dockerfile | |
| ENV DATABASE_PATH=/app/data/legal_dashboard.db | |
| RUN chmod +x start.sh | |
| CMD ["./start.sh"] | |
| ``` | |
| ### 4. Startup Script | |
| **File**: `start.sh` | |
| **Purpose**: Ensures proper directory creation and permissions before starting the application | |
| ```bash | |
| #!/bin/bash | |
| # Create data and cache directories if they don't exist | |
| mkdir -p /app/data /app/cache | |
| # Set proper permissions | |
| chmod -R 777 /app/data /app/cache | |
| # Start the application | |
| exec uvicorn app.main:app --host 0.0.0.0 --port 7860 | |
| ``` | |
| ### 5. Docker Compose Configuration | |
| **File**: `docker-compose.yml` | |
| **Changes**: | |
| - Added proper volume mounts for data persistence | |
| - Added environment variables | |
| - Added health check configuration | |
| - Improved service naming | |
| ### 6. Debug and Testing Tools | |
| **Files Created**: | |
| - `debug_container.py` - Tests container environment | |
| - `test_db_connection.py` - Tests database connectivity | |
| - `rebuild_and_test.sh` - Automated rebuild script (Linux/Mac) | |
| - `rebuild_and_test.ps1` - Automated rebuild script (Windows) | |
| ### 7. Documentation | |
| **File**: `DEPLOYMENT_GUIDE.md` | |
| **Content**: | |
| - Comprehensive troubleshooting guide | |
| - Step-by-step deployment instructions | |
| - Common issues and solutions | |
| - Environment variable documentation | |
| ## Testing the Fixes | |
| ### Quick Test Commands | |
| 1. **Test Database Connection**: | |
| ```bash | |
| docker run --rm legal-dashboard-ocr python debug_container.py | |
| ``` | |
| 2. **Rebuild and Test** (Windows): | |
| ```powershell | |
| .\rebuild_and_test.ps1 | |
| ``` | |
| 3. **Rebuild and Test** (Linux/Mac): | |
| ```bash | |
| ./rebuild_and_test.sh | |
| ``` | |
| 4. **Manual Docker Compose**: | |
| ```bash | |
| docker-compose up --build | |
| ``` | |
| ## Expected Results | |
| After applying these fixes: | |
| 1. β **Container starts successfully** without database errors | |
| 2. β **OCR models load properly** with fallback support | |
| 3. β **Database is accessible** and persistent across restarts | |
| 4. β **Health endpoint responds** correctly | |
| 5. β **Application is accessible** at `http://localhost:7860` | |
| ## Environment Variables | |
| | Variable | Default | Purpose | | |
| |----------|---------|---------| | |
| | `DATABASE_PATH` | `/app/data/legal_dashboard.db` | SQLite database location | | |
| | `TRANSFORMERS_CACHE` | `/app/cache` | Hugging Face model cache | | |
| | `HF_HOME` | `/app/cache` | Hugging Face home directory | | |
| | `HF_TOKEN` | (not set) | Hugging Face authentication | | |
| ## Volume Mounts | |
| - `./data:/app/data` - Database and uploaded files | |
| - `./cache:/app/cache` - Hugging Face model cache | |
| ## Next Steps | |
| 1. **Test the application** using the provided scripts | |
| 2. **Monitor logs** for any remaining issues | |
| 3. **Deploy to production** if testing is successful | |
| 4. **Add authentication** for production use | |
| 5. **Implement monitoring** for long-term stability | |
| ## Support | |
| If issues persist: | |
| 1. Check container logs: `docker logs <container_name>` | |
| 2. Run debug script: `docker exec -it <container> python debug_container.py` | |
| 3. Verify Docker resources (memory, disk space) | |
| 4. Check network connectivity for model downloads |