nielsr (HF Staff) committed
Commit 2aac224 · verified · 1 Parent(s): 310c4df

Add comprehensive model card for SV-DRR

This PR adds a comprehensive model card for the SV-DRR model.

It includes:
- Linking to the paper: [SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model](https://huggingface.co/papers/2507.05148)
- Adding `license: apache-2.0` based on common practice for research code.
- Specifying `library_name: diffusers` as evidenced by file information (`_diffusers_version`) and GitHub content, enabling the "how to use" widget.
- Setting `pipeline_tag: image-to-image` for better discoverability.
- Providing a direct link to the GitHub repository.
- Including sample usage code snippets from the GitHub README.

Please review and merge this PR if everything looks good.

Files changed (1)
  1. README.md +129 -0
README.md ADDED
@@ -0,0 +1,129 @@
+ ---
+ license: apache-2.0
+ library_name: diffusers
+ pipeline_tag: image-to-image
+ ---
+
+ # SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model
+
+ <p align="center">
+ <strong>MICCAI 2025</strong><br>
+ <img src="https://conferences.miccai.org/2025/files/images/layout/en/miccai2025-mobile-logo.png" alt="MICCAI 2025" height="80">
+ </p>
+
+ This repository contains the model presented in the paper [SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model](https://huggingface.co/papers/2507.05148).
+
+ **Authors:** Chun Xie, Yuichi Yoshii, Itaru Kitahara
+ *University of Tsukuba | Tokyo Medical University Ibaraki Medical Center*
+
+ ## Abstract
+
+ X-ray imaging is a rapid and cost-effective tool for visualizing internal human anatomy. While multi-view X-ray imaging provides complementary information that enhances diagnosis, intervention, and education, acquiring images from multiple angles increases radiation exposure and complicates clinical workflows. To address these challenges, we propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Unlike prior methods, which are limited in angular range, resolution, and image quality, our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles. This capability has significant implications not only for clinical applications but also for medical education and data extension, enabling the creation of diverse, high-quality datasets for training and analysis. Our code is available at https://github.com/xiechun298/SV-DRR.
+
+ ## TL;DR
+
+ We propose a novel view-conditioned diffusion model for synthesizing
+ multi-view X-ray images at resolutions up to 1024x1024 from a single view.
+
+ <p align="center">
+ <img src="https://github.com/xiechun298/SV-DRR/raw/main/assets/demo2.gif" alt="demo2.gif" width="500"/>
+ </p>
+
+ ## Visual Comparison with SOTA Methods
+ ![visualization](https://github.com/xiechun298/SV-DRR/raw/main/assets/visulization.svg)
+
+ ## DRR vs. SV-DRR
+ The name SV-DRR, short for Single-View DRR, is inspired by Digitally Reconstructed Radiography (DRR).
+
+ Unlike DRR, which renders X-ray projections from a 3D CT volume, our method synthesizes novel views directly from a single 2D projection.
+
+ ![SV_DRR](https://github.com/xiechun298/SV-DRR/raw/main/assets/SV_DRR.svg)
+
+ ## Code
+ The official code and further details can be found on the [GitHub repository](https://github.com/xiechun298/SV-DRR).
+
+ ## Usage
+
+ ### 🚀 Quick Start
+
+ #### 🛠️ Environment Setup
+
+ To ensure compatibility and reproducibility, follow these steps to set up the environment:
+
+ 1. **Clone the Repository**:
+ ```bash
+ git clone https://github.com/xiechun298/SV-DRR.git
+ cd SV-DRR
+ ```
+
+ 2. **Create the Conda Environment**:
+ ```bash
+ conda env create -f environment.yaml
+ ```
+
+ #### ⏬ Download Pretrained Models
+
+ You can download the pretrained models by either:
+
+ **Option 1: Automated Download (Recommended)**
+ ```bash
+ python scripts/download_models.py
+ ```
+ This will download all models into the `models/` directory. Shared components will be stored in the `shared/` folder, and symbolic links will be created in each model folder accordingly.
+
+ **Option 2: Manual Download from Hugging Face**
+ - 256 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-256
+ - 512 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-512
+ - 1024 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-1024
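
The manual option can also be scripted. The following is a minimal sketch, not part of the repository: it assumes the `huggingface_hub` package is installed and that each checkpoint should land in `models/DiT-fb-<resolution>`, a layout inferred from the `--model_path models/DiT-fb-512` argument used in the inference commands. The helper names are hypothetical, and unlike the automated script this does not share common components between model folders via symbolic links.

```python
from pathlib import Path

# Repository IDs exactly as listed under "Option 2".
SVDRR_REPOS = {
    256: "xiechun-tsukuba/svdrr-dit-fb-256",
    512: "xiechun-tsukuba/svdrr-dit-fb-512",
    1024: "xiechun-tsukuba/svdrr-dit-fb-1024",
}


def local_model_dir(resolution: int, root: str = "models") -> Path:
    """Return the target folder for a checkpoint (assumed naming scheme)."""
    if resolution not in SVDRR_REPOS:
        raise ValueError(f"No checkpoint listed for resolution {resolution}")
    return Path(root) / f"DiT-fb-{resolution}"


def download_model(resolution: int, root: str = "models") -> Path:
    """Download one checkpoint repository into its local folder."""
    # Imported lazily so the helpers above work without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = local_model_dir(resolution, root)
    snapshot_download(repo_id=SVDRR_REPOS[resolution], local_dir=str(target))
    return target
```

Under these assumptions, calling `download_model(512)` would fetch the 512-resolution checkpoint into `models/DiT-fb-512`.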
+
+ ### 🔍 Inference
+
+ **Important Note:** The coordinate system of LIDC-IDRI-DRR is opposite to the intuitive one — the polar angle increases downward, and the azimuth angle increases when rotating to the left. To invert the pose coordinate system, use the `--flip_pose` option.
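
As an illustration only, one plausible reading of inverting the pose coordinate system is to negate both angles, so that the polar angle increases upward and the azimuth increases when rotating to the right. The README does not spell out what `--flip_pose` does internally, so the function below is a hypothetical sketch of that reading, not the repository's implementation.

```python
def flip_pose(polar_deg: float, azimuth_deg: float) -> tuple[float, float]:
    """Switch sign conventions by negating both angles (assumed behaviour)."""
    return -polar_deg, -azimuth_deg
```

Applying the function twice returns the original pose, which is what one would expect from a pure change of sign convention.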
+
+ #### Single Image Inference
+
+ **Default views (azimuth angles from -90° to 90° in 5° increments):**
+ ```bash
+ python test_svdrr_DiT.py --model_path models/DiT-fb-512 \
+ --image_path demo/real_xray.jpg \
+ --log_dir outputs/ \
+ --image_size 512 \
+ --simple_pose
+ ```
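
For reference, the default view set described above can be enumerated in a few lines. This sketch assumes both endpoints (-90° and 90°) are included, which the wording suggests but does not state; `default_azimuths` is a hypothetical helper, not a function from the repository.

```python
def default_azimuths(start: int = -90, stop: int = 90, step: int = 5) -> list[int]:
    """Azimuth angles in degrees, from start to stop inclusive."""
    return list(range(start, stop + 1, step))


views = default_azimuths()
print(len(views), views[0], views[-1])  # prints "37 -90 90"
```

Under that assumption a `--simple_pose` run sweeps 37 views, including the frontal view at 0°.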
+
+ **User-specified views defined in camera_views.json:**
+ ```bash
+ python test_svdrr_DiT.py --model_path models/DiT-fb-512 \
+ --image_path demo/real_xray.jpg \
+ --log_dir outputs/ \
+ --image_size 512 \
+ --poses demo/camera_views.json
+ ```
+
+ For more detailed usage, including dataset inference and training, please refer to the [GitHub repository](https://github.com/xiechun298/SV-DRR).
+
+ ## Citation
+ If you find this work useful, please consider citing:
+
+ ```bibtex
+ @InProceedings{XieChu_SVDRR_MICCAI2025,
+     author    = {Xie, Chun and Yoshii, Yuichi and Kitahara, Itaru},
+     title     = {{SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model}},
+     booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
+     year      = {2025},
+     publisher = {Springer Nature Switzerland},
+     volume    = {LNCS 15963},
+     month     = {September},
+     pages     = {572--582},
+     doi       = {10.1007/978-3-032-04965-0_54}
+ }
+
+ @misc{xie2025svdrr,
+     title         = {SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model},
+     author        = {Chun Xie and Yuichi Yoshii and Itaru Kitahara},
+     year          = {2025},
+     eprint        = {2507.05148},
+     archivePrefix = {arXiv},
+     doi           = {10.48550/arXiv.2507.05148}
+ }
+ ```