Upload folder using huggingface_hub
- .gitattributes +1 -0
- README.md +85 -3
- assets/teaser.webp +3 -0
- assets/uso.webp +0 -0
- config.json +4 -0
- uso_flux_v1.0/dit_lora.safetensors +3 -0
- uso_flux_v1.0/projector.safetensors +3 -0
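The commit message above corresponds to `huggingface_hub`'s folder-upload API. A minimal sketch of how such an upload can be performed (not the authors' actual script; the repo id is taken from the model badge in the README below, and the local folder path is a hypothetical placeholder):

```python
# Hypothetical re-creation of a commit like this one via upload_folder.
# "local_uso_release/" is a placeholder path, not part of this repository.
from huggingface_hub import HfApi

api = HfApi()  # requires a write token, e.g. via `huggingface-cli login`
api.upload_folder(
    repo_id="bytedance-research/USO",
    repo_type="model",
    folder_path="local_uso_release/",
    commit_message="Upload folder using huggingface_hub",
)
```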
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/teaser.webp filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -1,3 +1,85 @@
----
-license: apache-2.0
----
---
license: apache-2.0
language:
- en
base_model:
- black-forest-labs/FLUX.1-dev
library_name: transformers
pipeline_tag: image-to-image
tags:
- image-generation
- subject-personalization
- style-transfer
- Diffusion-Transformer
---

<h3 align="center">
<img src="assets/uso.webp" alt="Logo" style="vertical-align: middle; width: 95px; height: auto;">
</br>
Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
</h3>

<p align="center">
<a href="https://github.com/bytedance/USO"><img alt="Build" src="https://img.shields.io/github/stars/bytedance/USO"></a>
<a href="https://bytedance.github.io/USO/"><img alt="Build" src="https://img.shields.io/badge/Project%20Page-USO-blue"></a>
<a href="https://arxiv.org/abs/2508.18966"><img alt="Build" src="https://img.shields.io/badge/Tech%20Report-USO-b31b1b.svg"></a>
<a href="https://huggingface.co/bytedance-research/USO"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Model&color=green"></a>
</p>
![teaser](assets/teaser.webp)
## 📖 Introduction
Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of “content” and “style”, a long-standing theme in style-driven research. To this end, we present USO, a Unified framework for Style-driven and subject-driven GeneratiOn. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives: style-alignment training and content–style disentanglement training. Third, we incorporate a style reward-learning paradigm to further enhance the model’s performance.

## ⚡️ Quick Start

### 🔧 Requirements and Installation
Clone our [GitHub repo](https://github.com/bytedance/USO).

Then install the requirements:
```bash
# Create a virtual environment with Python >= 3.10 and <= 3.12, e.g.:
# python -m venv uso_env
# source uso_env/bin/activate
# Then install the dependencies:
pip install -r requirements.txt
```
Then download the checkpoints in one of the following three ways:
1. Directly run the inference scripts; the checkpoints will be downloaded automatically by the `hf_hub_download` function in the code to your `$HF_HOME` (the default is `~/.cache/huggingface`).
2. Use `huggingface-cli download <repo name>` to download `black-forest-labs/FLUX.1-dev`, `xlabs-ai/xflux_text_encoders`, `openai/clip-vit-large-patch14`, and this model repo (`bytedance-research/USO`), then run the inference scripts.
3. Use `huggingface-cli download <repo name> --local-dir <LOCAL_DIR>` to download all the checkpoints mentioned in 2. to the directories you want. Then set the environment variable `TODO`. Finally, run the inference scripts. A programmatic sketch of pre-downloading these repos is shown after this list.
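As referenced in option 3 above, here is a minimal pre-download sketch using `huggingface_hub`'s `snapshot_download`. It is only an illustration (not necessarily what the inference scripts call internally) and assumes you are authenticated and have accepted the FLUX.1-dev license on the Hub:

```python
# Pre-fetch the repos listed above into the default $HF_HOME cache.
from huggingface_hub import snapshot_download

REPOS = [
    "black-forest-labs/FLUX.1-dev",
    "xlabs-ai/xflux_text_encoders",
    "openai/clip-vit-large-patch14",
    "bytedance-research/USO",
]

for repo_id in REPOS:
    # Pass local_dir=... to mirror option 3 and download to a directory of your choice.
    snapshot_download(repo_id=repo_id)
```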
### 🌟 Gradio Demo

```bash
python app.py
```
## 📄 Disclaimer
<p>
We open-source this project for academic research. The vast majority of images used in this project are either generated or from open-source datasets. If you have any concerns, please contact us, and we will promptly remove any inappropriate content.
Our project is released under the Apache 2.0 License. If you apply it to other base models, please ensure that you comply with the original licensing terms.
<br><br>This research aims to advance the field of generative AI. Users are free to create images using this tool, provided they comply with local laws and exercise responsible usage. The developers are not liable for any misuse of the tool by users.</p>
## Citation
We would also appreciate it if you could give a star ⭐ to our [GitHub repository](https://github.com/bytedance/USO). Thanks a lot!

If you find this project useful for your research, please consider citing our paper:
```bibtex
@article{wu2025uso,
  title={USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning},
  author={Shaojin Wu and Mengqi Huang and Yufeng Cheng and Wenxu Wu and Jiahe Tian and Yiming Luo and Fei Ding and Qian He},
  year={2025},
  eprint={2508.18966},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
}
```
assets/teaser.webp
ADDED
Stored with Git LFS.
assets/uso.webp
ADDED
config.json
ADDED
@@ -0,0 +1,4 @@
{
  "_diffusers_version": "0.30.1",
  "_uso_flux_version": "1.0"
}
uso_flux_v1.0/dit_lora.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a03fa8430997f1c371c2471b133bdc03433a50564e0a29c096217077b0309e41
size 478187816
uso_flux_v1.0/projector.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9a0dfcd6644e3acaf6995625562ab0af1f9cf048bf739c7e5822ee106fb44311
size 21548200
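The files added above (`config.json` plus the two LFS-stored safetensors checkpoints) can be fetched and inspected directly from this repo. A minimal sketch, assuming the `huggingface_hub` and `safetensors` packages are installed; how the USO inference code actually consumes these weights is defined in the GitHub repo:

```python
# Download config.json and the two checkpoint files from this model repo,
# then print the config and the number of tensors in each state dict.
import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

REPO_ID = "bytedance-research/USO"

with open(hf_hub_download(REPO_ID, "config.json")) as f:
    print(json.load(f))  # {'_diffusers_version': '0.30.1', '_uso_flux_version': '1.0'}

dit_lora = load_file(hf_hub_download(REPO_ID, "uso_flux_v1.0/dit_lora.safetensors"))
projector = load_file(hf_hub_download(REPO_ID, "uso_flux_v1.0/projector.safetensors"))
print(len(dit_lora), len(projector))  # tensor counts in each checkpoint
```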