Excited to share our new work on Unified Multimodal Models (UMMs): Reconstruction Alignment (RecA)! Just 6 × 80GB A100s × 4.5 hours to boost BAGEL performance across all tasks! Outperforms FLUX-Kontext in image editing capabilities!
Paper: https://alphaxiv.org/abs/2509.07295
Code: https://github.com/HorizonWind2004/reconstruction-alignment
HF Models: sanaka87/reca-68ad2176380355a3dcedc068
Demo: sanaka87/BAGEL-RecA
Project Page: https://reconstruction-alignment.github.io
X: https://x.com/XDWang101/status/1965908302581420204
Zhihu: https://zhuanlan.zhihu.com/p/1947584568187159814
HF Daily Paper: Reconstruction Alignment Improves Unified Multimodal Models (2509.07295)
<10k images & 27 GPU hours (no architecture changes) → SOTA, surpassing much larger open-source & private models:
GenEval: 0.73 → 0.90 | DPGBench: 80.93 → 88.15
ImgEdit: 3.38 → 3.75 | GEdit: 6.94 → 7.25
RecA trains UMMs to reconstruct images from their own visual understanding encoder embeddings → big gains in image generation & editing.
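For anyone who wants the reconstruction idea in miniature, here is a toy sketch, not the RecA training code. It assumes tiny linear maps as stand-ins: `FROZEN_ENCODER` plays the role of the visual understanding encoder and stays fixed, while a trainable `decoder` plays the role of the generator and learns to rebuild each input from its embedding. The names `FROZEN_ENCODER`, `matvec`, and `train_reconstruction`, the mean-squared-error loss, and the random 4-value "images" are all assumptions made for illustration.

```python
import random

# Hypothetical frozen "understanding encoder": a fixed, invertible 4x4 linear map.
# In this toy it is never updated, mirroring the idea of a frozen encoder.
FROZEN_ENCODER = [
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.5, 0.0, 0.0, 1.0],
]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def train_reconstruction(images, encoder, epochs=300, lr=0.2):
    """Train a linear decoder to rebuild each image from its frozen embedding.

    Returns the learned decoder and the mean reconstruction loss per epoch.
    """
    dim_img, dim_emb = len(images[0]), len(encoder)
    decoder = [[0.0] * dim_emb for _ in range(dim_img)]  # trainable stand-in
    losses = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for image in images:
            embedding = matvec(encoder, image)   # condition: encoder embedding
            recon = matvec(decoder, embedding)   # attempt to rebuild the input
            error = [r - x for r, x in zip(recon, image)]
            epoch_loss += sum(e * e for e in error) / dim_img
            # Per-sample gradient step on the mean squared error w.r.t. the decoder.
            for i in range(dim_img):
                for j in range(dim_emb):
                    decoder[i][j] -= lr * 2.0 * error[i] * embedding[j] / dim_img
        losses.append(epoch_loss / len(images))
    return decoder, losses


# Toy "images": 16 random vectors of 4 values in [0, 1).
rng = random.Random(0)
images = [[rng.random() for _ in range(4)] for _ in range(16)]
decoder, losses = train_reconstruction(images, FROZEN_ENCODER)
print(f"mean reconstruction loss: first epoch {losses[0]:.4f}, last epoch {losses[-1]:.4f}")
```

No captions or labels are involved: the only supervision is the input image itself, which is what makes the objective self-supervised in this sketch.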