# Warbler CDA HuggingFace Deployment

This directory contains the Warbler CDA package prepared for HuggingFace deployment.

## Quick Start

### Local Testing

```bash
# Install dependencies
pip install -r requirements.txt

# Install package in development mode
pip install -e .

# Run Gradio demo
python app.py
```

### Deploy to HuggingFace Space

#### Option 1: Manual Deployment

```bash
# Install HuggingFace CLI
pip install huggingface_hub

# Login
huggingface-cli login

# Upload to Space
huggingface-cli upload YOUR_USERNAME/warbler-cda . --repo-type=space
```

#### Option 2: GitLab CI/CD (Automated)

1. Set up your HuggingFace token in the GitLab CI/CD variables:
   - Go to Settings > CI/CD > Variables
   - Add variable `HF_TOKEN` with your HuggingFace token
   - Add variable `HF_SPACE_NAME` with your Space name (e.g., `username/warbler-cda`)

2. Push to the main branch or create a tag:

   ```bash
   git tag v0.1.0
   git push origin v0.1.0
   ```

3. The pipeline will automatically sync to HuggingFace!
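
The sync step can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the project's actual pipeline job: `read_sync_config` and `sync_to_space` are hypothetical helper names, and the sketch assumes the job simply uploads the repository folder with `HfApi.upload_folder` from `huggingface_hub`, using the `HF_TOKEN` and `HF_SPACE_NAME` variables from step 1.

```python
import os


def read_sync_config(env=None):
    """Read the two CI/CD variables from step 1; raise if either is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in ("HF_TOKEN", "HF_SPACE_NAME") if not env.get(name)]
    if missing:
        raise RuntimeError("missing CI/CD variables: " + ", ".join(missing))
    return env["HF_TOKEN"], env["HF_SPACE_NAME"]


def sync_to_space(folder="."):
    """Upload `folder` to the Space (hypothetical helper, assumed behaviour)."""
    # Imported lazily so the config check above works without huggingface_hub installed.
    from huggingface_hub import HfApi

    token, space_name = read_sync_config()
    HfApi(token=token).upload_folder(
        folder_path=folder,
        repo_id=space_name,
        repo_type="space",
    )
```

Calling `sync_to_space()` from a CI job would fail fast with a readable error if either variable is unset, before any network request is made.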

## Package Structure

```none
warbler-cda-package/
├── warbler_cda/              # Main package
│   ├── __init__.py
│   ├── retrieval_api.py      # Core RAG API
│   ├── semantic_anchors.py   # Semantic memory
│   ├── stat7_rag_bridge.py   # STAT7 hybrid scoring
│   ├── embeddings/           # Embedding providers
│   ├── api/                  # FastAPI service
│   └── utils/                # Utilities
├── app.py                    # Gradio demo for HF Space
├── requirements.txt          # Dependencies
├── pyproject.toml            # Package metadata
├── README.md                 # Documentation
└── LICENSE                   # MIT License
```

## Features

- **Semantic Search**: Natural language document retrieval
- **STAT7 Addressing**: 7-dimensional multi-modal scoring
- **Hybrid Scoring**: Combines semantic similarity with STAT7 scores when ranking results
- **Production API**: FastAPI service with concurrent query support
- **CLI Tools**: Command-line interface for management
- **HF Integration**: Direct dataset ingestion
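
To make the hybrid scoring idea concrete, here is a minimal sketch that blends a semantic similarity score with a second score using a weighted sum. It is an assumption for illustration only: the function names, the `weight` parameter, and the weighted-sum formula are hypothetical and are not taken from `stat7_rag_bridge.py`, which may combine the two signals differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(semantic, stat7, weight=0.6):
    """Blend two scores; `weight` is the share given to the semantic score.

    The weighted sum and the 0.6 default are assumptions, not the package's formula.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * semantic + (1.0 - weight) * stat7
```

With `weight=1.0` the blend reduces to pure semantic search, which gives a simple baseline for comparing against hybrid ranking.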

## Testing

```bash
# Run tests
pytest

# Run specific experiments
python -m warbler_cda.stat7_experiments
```

## Documentation

See [README.md](README.md) for full documentation.

## Support

- **Issues**: <https://gitlab.com/tiny-walnut-games/the-seed/-/issues>
- **Discussions**: <https://gitlab.com/tiny-walnut-games/the-seed/-/merge_requests>