Utilize HF's balanced device_map + move diffusion components to the relevant execution cores 098bca5 verified diopside committed on Sep 19