ivnle committed on
Commit 4ff3cec · verified · 1 Parent(s): 620b3fb

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +56 -0
README.md ADDED
@@ -0,0 +1,56 @@
+ ---
+ license: apache-2.0
+ tags:
+ - vision
+ - ocr
+ - compression
+ - autoencoding
+ ---
+
+ # Bad Autoencoding - Model Checkpoints
+
+ Checkpoints for the paper: **"Optical Context Compression Is Just (Bad) Autoencoding"** (ICML 2025)
+
+ ## Links
+
+ - **Paper**: [Coming soon]
+ - **Code**: [https://github.com/ivnle/bad-autoencoding](https://github.com/ivnle/bad-autoencoding)
+
+ ## Available Checkpoints
+
+ | Checkpoint | Objective | Training | PPL |
+ |------------|-----------|----------|-----|
+ | `vision_base_reconstruction` | Reconstruction | Direct | 1.03 |
+ | `vision_base_lm_direct` | Language Modeling | Direct (no recon init) | 5.08 |
+ | `vision_base_lm_recon_init` | Language Modeling | Initialized from reconstruction | 5.06 |
+
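The PPL column in the table above is easier to compare across objectives when expressed as a per-token loss. The sketch below assumes the usual definition, PPL = exp(mean per-token cross-entropy in nats); the README does not state how PPL was computed, so treat that as an assumption. The helper names `ppl_to_nats` and `ppl_to_bits` are illustrative and not part of the linked repository.

```python
import math

# PPL values copied from the "Available Checkpoints" table in the README.
REPORTED_PPL = {
    "vision_base_reconstruction": 1.03,
    "vision_base_lm_direct": 5.08,
    "vision_base_lm_recon_init": 5.06,
}


def ppl_to_nats(ppl: float) -> float:
    """Mean per-token cross-entropy in nats, ASSUMING ppl = exp(loss)."""
    if ppl < 1.0:
        raise ValueError("perplexity cannot be below 1")
    return math.log(ppl)


def ppl_to_bits(ppl: float) -> float:
    """Same quantity expressed in bits per token (log base 2)."""
    if ppl < 1.0:
        raise ValueError("perplexity cannot be below 1")
    return math.log2(ppl)


if __name__ == "__main__":
    for name, ppl in REPORTED_PPL.items():
        print(f"{name}: {ppl_to_nats(ppl):.3f} nats/token, "
              f"{ppl_to_bits(ppl):.3f} bits/token")
```

Under that assumption, the two language-modeling rows differ only slightly in per-token loss, while the reconstruction row sits close to zero loss.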
+ ## Model Details
+
+ - **Architecture**: DeepSeek-OCR with trainable vision encoder
+ - **Image Size**: 768x768 (base)
+ - **Encoder Status**: Trained (not frozen)
+ - **Dataset**: 510k samples
+
+ ## Usage
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # Download a specific checkpoint
+ checkpoint_path = hf_hub_download(
+     repo_id="ivnle/bad-autoencoding",
+     filename="vision_base_lm_direct/model.pt",
+     repo_type="model"
+ )
+ ```
+
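The Usage snippet in the README hard-codes one checkpoint. A minimal sketch of a helper that selects any of the three listed checkpoints by name follows. Only `vision_base_lm_direct/model.pt` appears in the README; that the other two checkpoints follow the same `<name>/model.pt` layout is an assumption, and `checkpoint_filename` and `download_checkpoint` are hypothetical names, not functions from the linked repository.

```python
from pathlib import Path

REPO_ID = "ivnle/bad-autoencoding"

# Checkpoint names taken from the "Available Checkpoints" table.
CHECKPOINTS = (
    "vision_base_reconstruction",
    "vision_base_lm_direct",
    "vision_base_lm_recon_init",
)


def checkpoint_filename(name: str) -> str:
    """Build the in-repo path for a checkpoint.

    ASSUMPTION: every checkpoint is stored as "<name>/model.pt". The README
    only confirms this layout for vision_base_lm_direct.
    """
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {name!r}; expected one of {CHECKPOINTS}")
    return f"{name}/model.pt"


def download_checkpoint(name: str) -> Path:
    """Download one checkpoint file and return its local cache path."""
    # Imported here so the helpers above work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=checkpoint_filename(name),
        repo_type="model",
    )
    return Path(local_path)
```

Calling `download_checkpoint("vision_base_lm_direct")` would fetch the same file as the README snippet. The README does not describe the contents of `model.pt`, so loading it is best done with the code in the linked repository.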
+ ## Citation
+
+ ```bibtex
+ @inproceedings{bad-autoencoding2025,
+   title={Optical Context Compression Is Just (Bad) Autoencoding},
+   author={...},
+   booktitle={ICML},
+   year={2025}
+ }
+ ```