[2025-08-14 18:05:08] Experiment directory created at ./outputs/audio_video/000-Wan2_1_T2V_1_3B
[2025-08-14 18:05:08] Training configuration: {'adam_eps': 1e-15, 'aes': None, 'audio_cfg': {'augmentation': {'mixup': 0.0}, 'preprocessing': {'audio': {'duration': 10.24, 'max_wav_value': 32768.0, 'sampling_rate': 16000, 'scale_factor': 8}, 'mel': {'mel_fmax': 8000, 'mel_fmin': 0, 'n_mel_channels': 64}, 'stft': {'filter_length': 1024, 'hop_length': 160, 'win_length': 1024}}}, 'audio_vae': {'from_pretrained': './checkpoints/audioldm2', 'type': 'AudioLDM2'}, 'audio_weight_path': 'exps/audio/dual_ffn_no_attnlora/epoch017-global_step75000', 'bucket_config': {'240p': {33: ((1.0, 1.0), 16), 49: ((1.0, 0.4), 12), 65: ((1.0, 0.3), 12), 81: ((1.0, 0.2), 10)}, '360p': {33: ((0.5, 0.5), 8), 49: ((0.5, 0.3), 6), 65: ((0.5, 0.2), 6), 81: ((0.5, 0.2), 5)}, '480p': {33: ((0.5, 0.3), 5), 49: ((1.0, 0.2), 4), 65: ((1.0, 0.2), 4), 81: ((1.0, 0.1), 3)}}, 'ckpt_every': 250, 'config': 'configs/wan2.1/train/stage2_audio_video.py', 'dataset': {'audio_transform_name': 'mel_spec_audioldm2', 'data_path': 'debug/meta/TAVGBench_train_140k.csv', 'default_video_fps': 16, 'direct_load_video_cli': True, 'scale_factor': 16, 'transform_name': 'resize_crop', 'type': 'VariableVideoAudioTextDataset', 'use_audio_in_video': True}, 'dtype': 'bf16', 'ema_decay': 0.99, 'epochs': 10, 'flow': None, 'grad_checkpoint': True, 'grad_clip': 1.0, 'load': None, 'load_text_features': False, 'log_every': 10, 'lora_alpha': 256, 'lora_dropout': 0, 'lora_enabled': True, 'lora_r': 128, 'lora_target_modules': ['self_attn.q', 'self_attn.k', 'self_attn.v', 'self_attn.o', 'cross_attn.q', 'cross_attn.k', 'cross_attn.v', 'cross_attn.o'], 'lr': 0.0001, 'mel_bins': 64, 'model': {'audio_in_dim': 8, 'audio_out_dim': 8, 'audio_patch_size': (2, 2), 'audio_special_token': False, 'class_drop_prob': 0.1, 'cross_attn_norm': True, 'dim': 1536, 'dual_ffn': True, 'ffn_dim': 8960, 'freq_dim': 256, 'init_from_video_branch': False, 'model_type': 't2av', 'num_heads': 12, 'num_layers': 30, 'patch_size': (1, 2, 2), 'qk_norm': True, 'train_audio_specific_blocks': False, 'type': 'Wan2_1_T2V_1_3B', 'weight_init_from': ['./checkpoints/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors', 'exps/audio/dual_ffn_no_attnlora/epoch017-global_step75000'], 'window_size': (-1, -1)}, 'neg_prompt': '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走,低音质,差音质,最差音质,噪音,失真的,破音,削波失真,数字瑕疵,声音故障,不自然的,刺耳的,尖锐的,底噪,过多混响,过多回声,突兀的剪辑,不自然的淡出,录音质量差,业余录音', 'num_bucket_build_workers': 8, 'num_workers': 16, 'outputs': './outputs/audio_video', 'plugin': 'zero2', 'port': 29500, 'record_time': False, 'sampling_rate': 16000, 'save_total_limit': 2, 'scheduler': {'num_sampling_steps': 50, 'transform_scale': 5.0, 'type': 'rflow', 'use_timestep_transform': True}, 'seed': 42, 'start_from_scratch': False, 'text_encoder': {'from_pretrained': './checkpoints/Wan2.1-T2V-1.3B', 't5_checkpoint': 'models_t5_umt5-xxl-enc-bf16.pth', 't5_tokenizer': 'google/umt5-xxl', 'text_len': 512, 'type': 'Wan2_1_T2V_1_3B_t5_umt5'}, 'vae': {'from_pretrained': './checkpoints/Wan2.1-T2V-1.3B', 'type': 'Wan2_1_T2V_1_3B_VAE', 'vae_checkpoint': 'Wan2.1_VAE.pth', 'vae_stride': (4, 8, 8)}, 'video_weight_path': './checkpoints/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors', 'wandb': False, 'warmup_steps': 1000}
[2025-08-14 18:05:08] Building dataset...
[2025-08-14 18:05:10] Dataset contains 140048 samples.
[2025-08-14 18:05:10] Number of buckets: 204
[2025-08-14 18:05:10] Building buckets...
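The `bucket_config` above groups clips by resolution and frame count. As a minimal sketch of reading it, assuming the Open-Sora-style convention that each entry is `num_frames: ((sampling probabilities), batch_size)` — an inference from the shape of the values, not something this log states:

```python
# Hypothetical interpretation of the bucket_config logged above: the ((p1, p2), bs)
# tuples are assumed to be (sampling probabilities, per-GPU batch size).
bucket_config = {
    "240p": {33: ((1.0, 1.0), 16), 49: ((1.0, 0.4), 12), 65: ((1.0, 0.3), 12), 81: ((1.0, 0.2), 10)},
    "360p": {33: ((0.5, 0.5), 8), 49: ((0.5, 0.3), 6), 65: ((0.5, 0.2), 6), 81: ((0.5, 0.2), 5)},
    "480p": {33: ((0.5, 0.3), 5), 49: ((1.0, 0.2), 4), 65: ((1.0, 0.2), 4), 81: ((1.0, 0.1), 3)},
}

def flatten_buckets(cfg):
    """Yield one (resolution, num_frames, probs, batch_size) row per bucket."""
    for res, frames in cfg.items():
        for n_frames, (probs, batch_size) in frames.items():
            yield res, n_frames, probs, batch_size

rows = list(flatten_buckets(bucket_config))  # 3 resolutions x 4 frame counts = 12 buckets
```

Note that smaller clips get larger batch sizes (16 at 240p/33 frames vs. 3 at 480p/81 frames), which is consistent with keeping per-step memory roughly constant across buckets.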
[2025-08-14 18:05:12] Bucket Info:
[2025-08-14 18:05:12] Bucket [#sample, #batch] by aspect ratio: {'0.38': [73, 6], '0.43': [269, 38], '0.48': [48, 2], '0.50': [82, 7], '0.53': [165, 21], '0.54': [578, 72], '0.56': [94859, 16895], '0.62': [844, 129], '0.67': [2354, 317], '0.75': [34023, 3483], '1.00': [303, 27], '1.33': [268, 22], '1.50': [76, 5], '1.78': [870, 90]}
[2025-08-14 18:05:12] Image Bucket [#sample, #batch] by HxWxT: {}
[2025-08-14 18:05:12] Video Bucket [#sample, #batch] by HxWxT: {('480p', 81): [8932, 2972], ('480p', 65): [16251, 4058], ('480p', 49): [13020, 3250], ('480p', 33): [7810, 1557], ('360p', 81): [6942, 1384], ('360p', 65): [5609, 930], ('360p', 49): [6592, 1093], ('360p', 33): [7835, 973], ('240p', 81): [12433, 1237], ('240p', 65): [14710, 1219], ('240p', 49): [13809, 1145], ('240p', 33): [20869, 1296]}
[2025-08-14 18:05:12] #training batch: 20.62 K, #training sample: 131.65 K, #non empty bucket: 164
[2025-08-14 18:05:12] Building models...
[2025-08-14 18:05:12] loading ./checkpoints/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth
[2025-08-14 18:05:20] loading ./checkpoints/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth
[2025-08-14 18:05:25] AudioLDM2 text free.
[2025-08-14 18:05:52] 825/982 keys loaded from ./checkpoints/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors.
[2025-08-14 18:05:55] Model checkpoint loaded from exps/audio/dual_ffn_no_attnlora/epoch017-global_step75000
[2025-08-14 18:05:58] Trainable model params: 90.00 M, Total model params: 2.19 B
[2025-08-14 18:05:58] Preparing for distributed training...
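The 90.00 M trainable parameters can be roughly reconciled with the LoRA settings in the config (`lora_r: 128`, `dim: 1536`, eight target modules, 30 layers). A back-of-the-envelope sketch, assuming every listed target module is a square `dim x dim` projection in every block — this slightly overshoots the logged 90.00 M, so treat it as an upper bound (some modules or layers presumably carry no adapter, or extra non-LoRA parameters shift the count):

```python
# Upper-bound estimate of LoRA adapter size from the logged config.
# Each adapted linear layer adds two low-rank factors: A (r x d_in) and B (d_out x r).
dim, rank = 1536, 128
num_layers = 30
num_target_modules = 8  # self_attn.{q,k,v,o} + cross_attn.{q,k,v,o}

params_per_module = rank * (dim + dim)                      # 393,216 per adapted linear
total = params_per_module * num_target_modules * num_layers  # ~94.4 M, vs. 90.00 M logged
```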
[2025-08-14 18:05:58] Boosting model for distributed training
[2025-08-14 18:05:58] Training for 10 epochs with 10557 steps per epoch
[2025-08-14 18:05:59] Using neg_prompt for classifier-free guidance training: 色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走,低音质,差音质,最差音质,噪音,失真的,破音,削波失真,数字瑕疵,声音故障,不自然的,刺耳的,尖锐的,底噪,过多混响,过多回声,突兀的剪辑,不自然的淡出,录音质量差,业余录音
[2025-08-14 18:06:02] Beginning epoch 0...
[2025-08-14 18:09:27] {'loss': '1.0546', 'loss_video': '0.2240', 'loss_audio': '0.8306', 'step': 9, 'global_step': 9}
[2025-08-14 18:11:43] {'loss': '1.1174', 'loss_video': '0.2757', 'loss_audio': '0.8417', 'step': 19, 'global_step': 19}
[2025-08-14 18:14:26] {'loss': '1.1779', 'loss_video': '0.3109', 'loss_audio': '0.8670', 'step': 29, 'global_step': 29}
[2025-08-14 18:16:47] {'loss': '1.1073', 'loss_video': '0.2771', 'loss_audio': '0.8302', 'step': 39, 'global_step': 39}
[2025-08-14 18:19:11] {'loss': '1.1208', 'loss_video': '0.3269', 'loss_audio': '0.7939', 'step': 49, 'global_step': 49}
[2025-08-14 18:21:33] {'loss': '1.0536', 'loss_video': '0.2927', 'loss_audio': '0.7610', 'step': 59, 'global_step': 59}
[2025-08-14 18:24:00] {'loss': '1.0562', 'loss_video': '0.3088', 'loss_audio': '0.7475', 'step': 69, 'global_step': 69}
[2025-08-14 18:26:31] {'loss': '0.9795', 'loss_video': '0.3118', 'loss_audio': '0.6677', 'step': 79, 'global_step': 79}
[2025-08-14 18:29:14] {'loss': '0.9179', 'loss_video': '0.2840', 'loss_audio': '0.6339', 'step': 89, 'global_step': 89}
[2025-08-14 18:31:50] {'loss': '0.8900', 'loss_video': '0.2756', 'loss_audio': '0.6144', 'step': 99, 'global_step': 99}
[2025-08-14 18:34:02] {'loss': '0.8635', 'loss_video': '0.2717', 'loss_audio': '0.5918', 'step': 109, 'global_step': 109}
[2025-08-14 18:36:20] {'loss': '0.8671', 'loss_video': '0.3080', 'loss_audio': '0.5591', 'step': 119, 'global_step': 119}
[2025-08-14 18:38:38] {'loss': '0.8372', 'loss_video': '0.2754', 'loss_audio': '0.5618', 'step': 129, 'global_step': 129}
[2025-08-14 18:40:45] {'loss': '0.8587', 'loss_video': '0.2757', 'loss_audio': '0.5830', 'step': 139, 'global_step': 139}
[2025-08-14 18:43:16] {'loss': '0.8006', 'loss_video': '0.2462', 'loss_audio': '0.5544', 'step': 149, 'global_step': 149}
[2025-08-14 18:45:37] {'loss': '0.7956', 'loss_video': '0.2658', 'loss_audio': '0.5298', 'step': 159, 'global_step': 159}
[2025-08-14 18:48:09] {'loss': '0.7974', 'loss_video': '0.2911', 'loss_audio': '0.5063', 'step': 169, 'global_step': 169}
[2025-08-14 18:50:21] {'loss': '0.7988', 'loss_video': '0.2503', 'loss_audio': '0.5485', 'step': 179, 'global_step': 179}
[2025-08-14 18:52:42] {'loss': '0.7346', 'loss_video': '0.2508', 'loss_audio': '0.4838', 'step': 189, 'global_step': 189}
[2025-08-14 18:55:13] {'loss': '0.6868', 'loss_video': '0.2371', 'loss_audio': '0.4497', 'step': 199, 'global_step': 199}
[2025-08-14 18:57:37] {'loss': '0.7626', 'loss_video': '0.2654', 'loss_audio': '0.4972', 'step': 209, 'global_step': 209}
[2025-08-14 18:59:57] {'loss': '0.7618', 'loss_video': '0.2578', 'loss_audio': '0.5040', 'step': 219, 'global_step': 219}
[2025-08-14 19:02:32] {'loss': '0.7382', 'loss_video': '0.2649', 'loss_audio': '0.4733', 'step': 229, 'global_step': 229}
[2025-08-14 19:05:05] {'loss': '0.7029', 'loss_video': '0.2713', 'loss_audio': '0.4316', 'step': 239, 'global_step': 239}
[2025-08-14 19:07:38] {'loss': '0.7884', 'loss_video': '0.2782', 'loss_audio': '0.5103', 'step': 249, 'global_step': 249}
[2025-08-14 19:07:44] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
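In the entries above, the logged `loss` appears to be the plain sum of `loss_video` and `loss_audio`. A quick check on the first three logged values (the summation rule is inferred from the numbers, not stated anywhere in this log):

```python
# Values copied from the first three logged steps; the decomposition
# loss == loss_video + loss_audio is an inference being tested here.
entries = [
    {"loss": 1.0546, "loss_video": 0.2240, "loss_audio": 0.8306},
    {"loss": 1.1174, "loss_video": 0.2757, "loss_audio": 0.8417},
    {"loss": 1.1779, "loss_video": 0.3109, "loss_audio": 0.8670},
]

def decomposes(e, tol=1e-4):
    """True if the total loss matches the sum of the branch losses."""
    return abs(e["loss"] - (e["loss_video"] + e["loss_audio"])) <= tol

all_ok = all(decomposes(e) for e in entries)
```

The audio branch dominates early (0.83 vs. 0.22 at step 9) and falls fastest, which matches the audio weights being freshly attached to a pretrained video backbone.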
[2025-08-14 19:08:01] Saved checkpoint at epoch 0, step 250, global_step 250 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step250
[2025-08-14 19:10:17] {'loss': '0.7638', 'loss_video': '0.2691', 'loss_audio': '0.4948', 'step': 259, 'global_step': 259}
[2025-08-14 19:12:43] {'loss': '0.7036', 'loss_video': '0.2603', 'loss_audio': '0.4433', 'step': 269, 'global_step': 269}
[2025-08-14 19:15:20] {'loss': '0.8479', 'loss_video': '0.3154', 'loss_audio': '0.5325', 'step': 279, 'global_step': 279}
[2025-08-14 19:17:52] {'loss': '0.8273', 'loss_video': '0.3091', 'loss_audio': '0.5181', 'step': 289, 'global_step': 289}
[2025-08-14 19:20:20] {'loss': '0.7967', 'loss_video': '0.3033', 'loss_audio': '0.4934', 'step': 299, 'global_step': 299}
[2025-08-14 19:22:39] {'loss': '0.6832', 'loss_video': '0.2357', 'loss_audio': '0.4475', 'step': 309, 'global_step': 309}
[2025-08-14 19:25:04] {'loss': '0.7267', 'loss_video': '0.2725', 'loss_audio': '0.4541', 'step': 319, 'global_step': 319}
[2025-08-14 19:27:26] {'loss': '0.7429', 'loss_video': '0.2726', 'loss_audio': '0.4703', 'step': 329, 'global_step': 329}
[2025-08-14 19:29:37] {'loss': '0.7577', 'loss_video': '0.2756', 'loss_audio': '0.4821', 'step': 339, 'global_step': 339}
[2025-08-14 19:32:07] {'loss': '0.7284', 'loss_video': '0.2541', 'loss_audio': '0.4743', 'step': 349, 'global_step': 349}
[2025-08-14 19:34:20] {'loss': '0.7377', 'loss_video': '0.2437', 'loss_audio': '0.4940', 'step': 359, 'global_step': 359}
[2025-08-14 19:36:22] {'loss': '0.6873', 'loss_video': '0.2468', 'loss_audio': '0.4405', 'step': 369, 'global_step': 369}
[2025-08-14 19:38:54] {'loss': '0.7569', 'loss_video': '0.2837', 'loss_audio': '0.4732', 'step': 379, 'global_step': 379}
[2025-08-14 19:41:17] {'loss': '0.7155', 'loss_video': '0.2642', 'loss_audio': '0.4513', 'step': 389, 'global_step': 389}
[2025-08-14 19:43:36] {'loss': '0.6550', 'loss_video': '0.2147', 'loss_audio': '0.4404', 'step': 399, 'global_step': 399}
[2025-08-14 19:46:14] {'loss': '0.7270', 'loss_video': '0.2667', 'loss_audio': '0.4603', 'step': 409, 'global_step': 409}
[2025-08-14 19:48:37] {'loss': '0.7176', 'loss_video': '0.2679', 'loss_audio': '0.4497', 'step': 419, 'global_step': 419}
[2025-08-14 19:50:54] {'loss': '0.7065', 'loss_video': '0.2520', 'loss_audio': '0.4545', 'step': 429, 'global_step': 429}
[2025-08-14 19:53:15] {'loss': '0.7197', 'loss_video': '0.2379', 'loss_audio': '0.4819', 'step': 439, 'global_step': 439}
[2025-08-14 19:55:50] {'loss': '0.6798', 'loss_video': '0.2551', 'loss_audio': '0.4247', 'step': 449, 'global_step': 449}
[2025-08-14 19:58:17] {'loss': '0.7111', 'loss_video': '0.2516', 'loss_audio': '0.4596', 'step': 459, 'global_step': 459}
[2025-08-14 20:00:44] {'loss': '0.6537', 'loss_video': '0.2280', 'loss_audio': '0.4257', 'step': 469, 'global_step': 469}
[2025-08-14 20:02:58] {'loss': '0.7007', 'loss_video': '0.2722', 'loss_audio': '0.4285', 'step': 479, 'global_step': 479}
[2025-08-14 20:05:24] {'loss': '0.7388', 'loss_video': '0.2924', 'loss_audio': '0.4464', 'step': 489, 'global_step': 489}
[2025-08-14 20:07:39] {'loss': '0.7077', 'loss_video': '0.2769', 'loss_audio': '0.4308', 'step': 499, 'global_step': 499}
[2025-08-14 20:07:45] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-14 20:08:01] Saved checkpoint at epoch 0, step 500, global_step 500 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step500
[2025-08-14 20:10:07] {'loss': '0.7504', 'loss_video': '0.2727', 'loss_audio': '0.4777', 'step': 509, 'global_step': 509}
[2025-08-14 20:12:38] {'loss': '0.7183', 'loss_video': '0.2625', 'loss_audio': '0.4559', 'step': 519, 'global_step': 519}
[2025-08-14 20:15:16] {'loss': '0.7617', 'loss_video': '0.3024', 'loss_audio': '0.4593', 'step': 529, 'global_step': 529}
[2025-08-14 20:17:46] {'loss': '0.7419', 'loss_video': '0.2844', 'loss_audio': '0.4575', 'step': 539, 'global_step': 539}
[2025-08-14 20:20:24] {'loss': '0.6453', 'loss_video': '0.2335', 'loss_audio': '0.4118', 'step': 549, 'global_step': 549}
[2025-08-14 20:23:04] {'loss': '0.7050', 'loss_video': '0.2616', 'loss_audio': '0.4434', 'step': 559, 'global_step': 559}
[2025-08-14 20:25:36] {'loss': '0.7280', 'loss_video': '0.2737', 'loss_audio': '0.4544', 'step': 569, 'global_step': 569}
[2025-08-14 20:28:06] {'loss': '0.7160', 'loss_video': '0.2546', 'loss_audio': '0.4615', 'step': 579, 'global_step': 579}
[2025-08-14 20:30:21] {'loss': '0.6748', 'loss_video': '0.2452', 'loss_audio': '0.4296', 'step': 589, 'global_step': 589}
[2025-08-14 20:32:40] {'loss': '0.7025', 'loss_video': '0.2705', 'loss_audio': '0.4319', 'step': 599, 'global_step': 599}
[2025-08-14 20:35:00] {'loss': '0.7030', 'loss_video': '0.2640', 'loss_audio': '0.4390', 'step': 609, 'global_step': 609}
[2025-08-14 20:37:00] {'loss': '0.7289', 'loss_video': '0.2488', 'loss_audio': '0.4801', 'step': 619, 'global_step': 619}
[2025-08-14 20:39:29] {'loss': '0.6844', 'loss_video': '0.2762', 'loss_audio': '0.4083', 'step': 629, 'global_step': 629}
[2025-08-14 20:42:01] {'loss': '0.6325', 'loss_video': '0.2327', 'loss_audio': '0.3997', 'step': 639, 'global_step': 639}
[2025-08-14 20:44:16] {'loss': '0.7423', 'loss_video': '0.2945', 'loss_audio': '0.4478', 'step': 649, 'global_step': 649}
[2025-08-14 20:46:27] {'loss': '0.6720', 'loss_video': '0.2407', 'loss_audio': '0.4313', 'step': 659, 'global_step': 659}
[2025-08-14 20:48:59] {'loss': '0.7155', 'loss_video': '0.2713', 'loss_audio': '0.4443', 'step': 669, 'global_step': 669}
[2025-08-14 20:51:13] {'loss': '0.7036', 'loss_video': '0.2695', 'loss_audio': '0.4340', 'step': 679, 'global_step': 679}
[2025-08-14 20:53:31] {'loss': '0.6439', 'loss_video': '0.2245', 'loss_audio': '0.4193', 'step': 689, 'global_step': 689}
[2025-08-14 20:55:31] {'loss': '0.6580', 'loss_video': '0.2470', 'loss_audio': '0.4110', 'step': 699, 'global_step': 699}
[2025-08-14 20:57:58] {'loss': '0.6947', 'loss_video': '0.2488', 'loss_audio': '0.4459', 'step': 709, 'global_step': 709}
[2025-08-14 21:00:37] {'loss': '0.7716', 'loss_video': '0.2847', 'loss_audio': '0.4868', 'step': 719, 'global_step': 719}
[2025-08-14 21:03:04] {'loss': '0.7170', 'loss_video': '0.2700', 'loss_audio': '0.4470', 'step': 729, 'global_step': 729}
[2025-08-14 21:05:20] {'loss': '0.6677', 'loss_video': '0.2485', 'loss_audio': '0.4192', 'step': 739, 'global_step': 739}
[2025-08-14 21:07:34] {'loss': '0.7137', 'loss_video': '0.2591', 'loss_audio': '0.4546', 'step': 749, 'global_step': 749}
[2025-08-14 21:07:40] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-14 21:07:56] Saved checkpoint at epoch 0, step 750, global_step 750 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step750
[2025-08-14 21:07:56] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step250 has been deleted successfully as cfg.save_total_limit!
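The deletion above is the `save_total_limit: 2` rotation at work: every save (each `ckpt_every: 250` steps) keeps only the two newest checkpoints and removes the oldest. A hypothetical sketch of that policy — not the trainer's actual code:

```python
# Illustrative checkpoint rotation matching the behavior logged above.
def rotate_checkpoints(existing, new_ckpt, save_total_limit=2):
    """Append new_ckpt and drop the oldest entries beyond save_total_limit.

    Returns (kept, deleted); `existing` is assumed to be ordered oldest-first.
    """
    kept = existing + [new_ckpt]
    deleted = []
    while len(kept) > save_total_limit:
        deleted.append(kept.pop(0))  # oldest checkpoint goes first
    return kept, deleted

# After saving global_step750, global_step250 is removed -- as in the log.
kept, deleted = rotate_checkpoints(
    ["epoch000-global_step250", "epoch000-global_step500"],
    "epoch000-global_step750",
)
```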
[2025-08-14 21:10:26] {'loss': '0.6234', 'loss_video': '0.2171', 'loss_audio': '0.4063', 'step': 759, 'global_step': 759}
[2025-08-14 21:13:04] {'loss': '0.7227', 'loss_video': '0.2313', 'loss_audio': '0.4914', 'step': 769, 'global_step': 769}
[2025-08-14 21:15:47] {'loss': '0.6873', 'loss_video': '0.2519', 'loss_audio': '0.4354', 'step': 779, 'global_step': 779}
[2025-08-14 21:18:00] {'loss': '0.6505', 'loss_video': '0.2313', 'loss_audio': '0.4191', 'step': 789, 'global_step': 789}
[2025-08-14 21:20:50] {'loss': '0.6825', 'loss_video': '0.2661', 'loss_audio': '0.4165', 'step': 799, 'global_step': 799}
[2025-08-14 21:23:35] {'loss': '0.8783', 'loss_video': '0.3279', 'loss_audio': '0.5504', 'step': 809, 'global_step': 809}
[2025-08-14 21:26:20] {'loss': '0.7521', 'loss_video': '0.3023', 'loss_audio': '0.4498', 'step': 819, 'global_step': 819}
[2025-08-14 21:29:06] {'loss': '0.6828', 'loss_video': '0.2441', 'loss_audio': '0.4386', 'step': 829, 'global_step': 829}
[2025-08-14 21:31:13] {'loss': '0.7016', 'loss_video': '0.2543', 'loss_audio': '0.4474', 'step': 839, 'global_step': 839}
[2025-08-14 21:33:37] {'loss': '0.6757', 'loss_video': '0.2513', 'loss_audio': '0.4244', 'step': 849, 'global_step': 849}
[2025-08-14 21:36:20] {'loss': '0.7161', 'loss_video': '0.3049', 'loss_audio': '0.4112', 'step': 859, 'global_step': 859}
[2025-08-14 21:38:56] {'loss': '0.6692', 'loss_video': '0.2366', 'loss_audio': '0.4326', 'step': 869, 'global_step': 869}
[2025-08-14 21:41:17] {'loss': '0.7145', 'loss_video': '0.2359', 'loss_audio': '0.4786', 'step': 879, 'global_step': 879}
[2025-08-14 21:43:47] {'loss': '0.6115', 'loss_video': '0.2107', 'loss_audio': '0.4008', 'step': 889, 'global_step': 889}
[2025-08-14 21:46:19] {'loss': '0.6430', 'loss_video': '0.2352', 'loss_audio': '0.4078', 'step': 899, 'global_step': 899}
[2025-08-14 21:48:56] {'loss': '0.6193', 'loss_video': '0.2350', 'loss_audio': '0.3843', 'step': 909, 'global_step': 909}
[2025-08-14 21:51:22] {'loss': '0.7422', 'loss_video': '0.2856', 'loss_audio': '0.4566', 'step': 919, 'global_step': 919}
[2025-08-14 21:53:51] {'loss': '0.6413', 'loss_video': '0.2201', 'loss_audio': '0.4212', 'step': 929, 'global_step': 929}
[2025-08-14 21:56:27] {'loss': '0.7125', 'loss_video': '0.2704', 'loss_audio': '0.4422', 'step': 939, 'global_step': 939}
[2025-08-14 21:58:47] {'loss': '0.6832', 'loss_video': '0.2533', 'loss_audio': '0.4299', 'step': 949, 'global_step': 949}
[2025-08-14 22:01:10] {'loss': '0.7050', 'loss_video': '0.2307', 'loss_audio': '0.4743', 'step': 959, 'global_step': 959}
[2025-08-14 22:03:46] {'loss': '0.6809', 'loss_video': '0.2534', 'loss_audio': '0.4275', 'step': 969, 'global_step': 969}
[2025-08-14 22:06:26] {'loss': '0.7083', 'loss_video': '0.2684', 'loss_audio': '0.4399', 'step': 979, 'global_step': 979}
[2025-08-14 22:09:06] {'loss': '0.6936', 'loss_video': '0.2683', 'loss_audio': '0.4253', 'step': 989, 'global_step': 989}
[2025-08-14 22:11:14] {'loss': '0.7946', 'loss_video': '0.2771', 'loss_audio': '0.5175', 'step': 999, 'global_step': 999}
[2025-08-14 22:11:20] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-14 22:11:36] Saved checkpoint at epoch 0, step 1000, global_step 1000 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1000
[2025-08-14 22:11:36] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step500 has been deleted successfully as cfg.save_total_limit!
[2025-08-14 22:13:54] {'loss': '0.6308', 'loss_video': '0.2332', 'loss_audio': '0.3976', 'step': 1009, 'global_step': 1009}
[2025-08-14 22:16:20] {'loss': '0.7278', 'loss_video': '0.2532', 'loss_audio': '0.4746', 'step': 1019, 'global_step': 1019}
[2025-08-14 22:18:46] {'loss': '0.6938', 'loss_video': '0.2723', 'loss_audio': '0.4215', 'step': 1029, 'global_step': 1029}
[2025-08-14 22:21:21] {'loss': '0.6224', 'loss_video': '0.2357', 'loss_audio': '0.3867', 'step': 1039, 'global_step': 1039}
[2025-08-14 22:23:45] {'loss': '0.6683', 'loss_video': '0.2615', 'loss_audio': '0.4068', 'step': 1049, 'global_step': 1049}
[2025-08-14 22:25:44] {'loss': '0.6627', 'loss_video': '0.2389', 'loss_audio': '0.4238', 'step': 1059, 'global_step': 1059}
[2025-08-14 22:28:16] {'loss': '0.6491', 'loss_video': '0.2532', 'loss_audio': '0.3960', 'step': 1069, 'global_step': 1069}
[2025-08-14 22:30:27] {'loss': '0.7730', 'loss_video': '0.3141', 'loss_audio': '0.4590', 'step': 1079, 'global_step': 1079}
[2025-08-14 22:32:51] {'loss': '0.7272', 'loss_video': '0.2633', 'loss_audio': '0.4639', 'step': 1089, 'global_step': 1089}
[2025-08-14 22:35:15] {'loss': '0.7188', 'loss_video': '0.2911', 'loss_audio': '0.4277', 'step': 1099, 'global_step': 1099}
[2025-08-14 22:37:45] {'loss': '0.7428', 'loss_video': '0.2496', 'loss_audio': '0.4931', 'step': 1109, 'global_step': 1109}
[2025-08-14 22:40:15] {'loss': '0.6475', 'loss_video': '0.2369', 'loss_audio': '0.4106', 'step': 1119, 'global_step': 1119}
[2025-08-14 22:42:37] {'loss': '0.7178', 'loss_video': '0.2717', 'loss_audio': '0.4461', 'step': 1129, 'global_step': 1129}
[2025-08-14 22:45:17] {'loss': '0.6458', 'loss_video': '0.2332', 'loss_audio': '0.4126', 'step': 1139, 'global_step': 1139}
[2025-08-14 22:47:43] {'loss': '0.6504', 'loss_video': '0.2316', 'loss_audio': '0.4188', 'step': 1149, 'global_step': 1149}
[2025-08-14 22:50:03] {'loss': '0.6584', 'loss_video': '0.2633', 'loss_audio': '0.3951', 'step': 1159, 'global_step': 1159}
[2025-08-14 22:52:38] {'loss': '0.7234', 'loss_video': '0.2774', 'loss_audio': '0.4460', 'step': 1169, 'global_step': 1169}
[2025-08-14 22:55:14] {'loss': '0.7338', 'loss_video': '0.2872', 'loss_audio': '0.4465', 'step': 1179, 'global_step': 1179}
[2025-08-14 22:58:03] {'loss': '0.6950', 'loss_video': '0.3019', 'loss_audio': '0.3930', 'step': 1189, 'global_step': 1189}
[2025-08-14 23:00:35] {'loss': '0.6626', 'loss_video': '0.2341', 'loss_audio': '0.4284', 'step': 1199, 'global_step': 1199}
[2025-08-14 23:03:12] {'loss': '0.7572', 'loss_video': '0.3030', 'loss_audio': '0.4542', 'step': 1209, 'global_step': 1209}
[2025-08-14 23:05:26] {'loss': '0.7260', 'loss_video': '0.2969', 'loss_audio': '0.4291', 'step': 1219, 'global_step': 1219}
[2025-08-14 23:07:44] {'loss': '0.7288', 'loss_video': '0.2796', 'loss_audio': '0.4492', 'step': 1229, 'global_step': 1229}
[2025-08-14 23:10:20] {'loss': '0.6255', 'loss_video': '0.2399', 'loss_audio': '0.3856', 'step': 1239, 'global_step': 1239}
[2025-08-14 23:12:56] {'loss': '0.7177', 'loss_video': '0.2681', 'loss_audio': '0.4496', 'step': 1249, 'global_step': 1249}
[2025-08-14 23:13:02] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-14 23:13:17] Saved checkpoint at epoch 0, step 1250, global_step 1250 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1250
[2025-08-14 23:13:18] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step750 has been deleted successfully as cfg.save_total_limit!
[2025-08-14 23:15:34] {'loss': '0.6535', 'loss_video': '0.2660', 'loss_audio': '0.3876', 'step': 1259, 'global_step': 1259}
[2025-08-14 23:17:55] {'loss': '0.6365', 'loss_video': '0.2388', 'loss_audio': '0.3977', 'step': 1269, 'global_step': 1269}
[2025-08-14 23:20:28] {'loss': '0.6766', 'loss_video': '0.2450', 'loss_audio': '0.4316', 'step': 1279, 'global_step': 1279}
[2025-08-14 23:22:41] {'loss': '0.6591', 'loss_video': '0.2650', 'loss_audio': '0.3941', 'step': 1289, 'global_step': 1289}
[2025-08-14 23:25:12] {'loss': '0.6834', 'loss_video': '0.2569', 'loss_audio': '0.4265', 'step': 1299, 'global_step': 1299}
[2025-08-14 23:27:41] {'loss': '0.7757', 'loss_video': '0.2959', 'loss_audio': '0.4798', 'step': 1309, 'global_step': 1309}
[2025-08-14 23:30:27] {'loss': '0.7007', 'loss_video': '0.2565', 'loss_audio': '0.4441', 'step': 1319, 'global_step': 1319}
[2025-08-14 23:33:16] {'loss': '0.6907', 'loss_video': '0.2822', 'loss_audio': '0.4085', 'step': 1329, 'global_step': 1329}
[2025-08-14 23:35:30] {'loss': '0.6729', 'loss_video': '0.2537', 'loss_audio': '0.4192', 'step': 1339, 'global_step': 1339}
[2025-08-14 23:38:05] {'loss': '0.6715', 'loss_video': '0.2695', 'loss_audio': '0.4021', 'step': 1349, 'global_step': 1349}
[2025-08-14 23:40:18] {'loss': '0.6379', 'loss_video': '0.2254', 'loss_audio': '0.4126', 'step': 1359, 'global_step': 1359}
[2025-08-14 23:43:02] {'loss': '0.7082', 'loss_video': '0.2976', 'loss_audio': '0.4106', 'step': 1369, 'global_step': 1369}
[2025-08-14 23:45:44] {'loss': '0.7550', 'loss_video': '0.2776', 'loss_audio': '0.4774', 'step': 1379, 'global_step': 1379}
[2025-08-14 23:47:43] {'loss': '0.6700', 'loss_video': '0.2426', 'loss_audio': '0.4274', 'step': 1389, 'global_step': 1389}
[2025-08-14 23:50:11] {'loss': '0.7661', 'loss_video': '0.2892', 'loss_audio': '0.4769', 'step': 1399, 'global_step': 1399}
[2025-08-14 23:52:39] {'loss': '0.6683', 'loss_video': '0.2636', 'loss_audio': '0.4048', 'step': 1409, 'global_step': 1409}
[2025-08-14 23:54:53] {'loss': '0.6893', 'loss_video': '0.2638', 'loss_audio': '0.4256', 'step': 1419, 'global_step': 1419}
[2025-08-14 23:57:15] {'loss': '0.6890', 'loss_video': '0.2601', 'loss_audio': '0.4289', 'step': 1429, 'global_step': 1429}
[2025-08-14 23:59:54] {'loss': '0.7233', 'loss_video': '0.2842', 'loss_audio': '0.4391', 'step': 1439, 'global_step': 1439}
[2025-08-15 00:02:26] {'loss': '0.6330', 'loss_video': '0.2483', 'loss_audio': '0.3847', 'step': 1449, 'global_step': 1449}
[2025-08-15 00:04:47] {'loss': '0.6800', 'loss_video': '0.2358', 'loss_audio': '0.4442', 'step': 1459, 'global_step': 1459}
[2025-08-15 00:07:17] {'loss': '0.6781', 'loss_video': '0.2600', 'loss_audio': '0.4181', 'step': 1469, 'global_step': 1469}
[2025-08-15 00:10:00] {'loss': '0.7476', 'loss_video': '0.2694', 'loss_audio': '0.4782', 'step': 1479, 'global_step': 1479}
[2025-08-15 00:12:22] {'loss': '0.6753', 'loss_video': '0.2588', 'loss_audio': '0.4165', 'step': 1489, 'global_step': 1489}
[2025-08-15 00:14:58] {'loss': '0.7320', 'loss_video': '0.2669', 'loss_audio': '0.4651', 'step': 1499, 'global_step': 1499}
[2025-08-15 00:15:04] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-15 00:15:20] Saved checkpoint at epoch 0, step 1500, global_step 1500 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1500
[2025-08-15 00:15:20] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1000 has been deleted successfully as cfg.save_total_limit!
[2025-08-15 00:17:49] {'loss': '0.7215', 'loss_video': '0.2450', 'loss_audio': '0.4765', 'step': 1509, 'global_step': 1509}
[2025-08-15 00:20:36] {'loss': '0.7134', 'loss_video': '0.2815', 'loss_audio': '0.4318', 'step': 1519, 'global_step': 1519}
[2025-08-15 00:23:17] {'loss': '0.7562', 'loss_video': '0.2664', 'loss_audio': '0.4898', 'step': 1529, 'global_step': 1529}
[2025-08-15 00:25:59] {'loss': '0.6606', 'loss_video': '0.2512', 'loss_audio': '0.4094', 'step': 1539, 'global_step': 1539}
[2025-08-15 00:28:19] {'loss': '0.7124', 'loss_video': '0.2793', 'loss_audio': '0.4331', 'step': 1549, 'global_step': 1549}
[2025-08-15 00:30:39] {'loss': '0.7327', 'loss_video': '0.2855', 'loss_audio': '0.4472', 'step': 1559, 'global_step': 1559}
[2025-08-15 00:33:13] {'loss': '0.7380', 'loss_video': '0.2825', 'loss_audio': '0.4555', 'step': 1569, 'global_step': 1569}
[2025-08-15 00:35:44] {'loss': '0.6097', 'loss_video': '0.2297', 'loss_audio': '0.3800', 'step': 1579, 'global_step': 1579}
[2025-08-15 00:38:22] {'loss': '0.7373', 'loss_video': '0.3022', 'loss_audio': '0.4351', 'step': 1589, 'global_step': 1589}
[2025-08-15 00:40:57] {'loss': '0.7022', 'loss_video': '0.2680', 'loss_audio': '0.4342', 'step': 1599, 'global_step': 1599}
[2025-08-15 00:43:35] {'loss': '0.7057', 'loss_video': '0.2755', 'loss_audio': '0.4302', 'step': 1609, 'global_step': 1609}
[2025-08-15 00:45:59] {'loss': '0.6906', 'loss_video': '0.2756', 'loss_audio': '0.4150', 'step': 1619, 'global_step': 1619}
[2025-08-15 00:48:11] {'loss': '0.6354', 'loss_video': '0.2248', 'loss_audio': '0.4106', 'step': 1629, 'global_step': 1629}
[2025-08-15 00:50:42] {'loss': '0.7065', 'loss_video': '0.2562', 'loss_audio': '0.4503', 'step': 1639, 'global_step': 1639}
[2025-08-15 00:53:09] {'loss': '0.7470', 'loss_video': '0.2996', 'loss_audio': '0.4474', 'step': 1649, 'global_step': 1649}
[2025-08-15 00:55:27] {'loss': '0.7066', 'loss_video': '0.2630', 'loss_audio': '0.4436', 'step': 1659, 'global_step': 1659}
[2025-08-15 00:58:10] {'loss': '0.6765', 'loss_video': '0.2197', 'loss_audio': '0.4568', 'step': 1669, 'global_step': 1669}
[2025-08-15 01:01:02] {'loss': '0.7021', 'loss_video': '0.2816', 'loss_audio': '0.4205', 'step': 1679, 'global_step': 1679}
[2025-08-15 01:03:17] {'loss': '0.6536', 'loss_video': '0.2261', 'loss_audio': '0.4275', 'step': 1689, 'global_step': 1689}
[2025-08-15 01:05:43] {'loss': '0.6590', 'loss_video': '0.2434', 'loss_audio': '0.4156', 'step': 1699, 'global_step': 1699}
[2025-08-15 01:08:03] {'loss': '0.6739', 'loss_video': '0.2763', 'loss_audio': '0.3976', 'step': 1709, 'global_step': 1709}
[2025-08-15 01:10:39] {'loss': '0.7146', 'loss_video': '0.2805', 'loss_audio': '0.4340', 'step': 1719, 'global_step': 1719}
[2025-08-15 01:13:08] {'loss': '0.7519', 'loss_video': '0.2989', 'loss_audio': '0.4530', 'step': 1729, 'global_step': 1729}
[2025-08-15 01:15:26] {'loss': '0.7181', 'loss_video': '0.2682', 'loss_audio': '0.4499', 'step': 1739, 'global_step': 1739}
[2025-08-15 01:17:42] {'loss': '0.6828', 'loss_video': '0.2523', 'loss_audio': '0.4305', 'step': 1749, 'global_step': 1749}
[2025-08-15 01:17:48] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-15 01:18:04] Saved checkpoint at epoch 0, step 1750, global_step 1750 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1750
[2025-08-15 01:18:04] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1250 has been deleted successfully as cfg.save_total_limit!
[2025-08-15 01:20:19] {'loss': '0.6897', 'loss_video': '0.2606', 'loss_audio': '0.4291', 'step': 1759, 'global_step': 1759} [2025-08-15 01:23:02] {'loss': '0.7142', 'loss_video': '0.2745', 'loss_audio': '0.4397', 'step': 1769, 'global_step': 1769} [2025-08-15 01:25:37] {'loss': '0.7238', 'loss_video': '0.2907', 'loss_audio': '0.4330', 'step': 1779, 'global_step': 1779} [2025-08-15 01:28:03] {'loss': '0.6775', 'loss_video': '0.2621', 'loss_audio': '0.4154', 'step': 1789, 'global_step': 1789} [2025-08-15 01:30:36] {'loss': '0.6536', 'loss_video': '0.2541', 'loss_audio': '0.3995', 'step': 1799, 'global_step': 1799} [2025-08-15 01:33:02] {'loss': '0.7145', 'loss_video': '0.2745', 'loss_audio': '0.4400', 'step': 1809, 'global_step': 1809} [2025-08-15 01:35:29] {'loss': '0.6730', 'loss_video': '0.2842', 'loss_audio': '0.3888', 'step': 1819, 'global_step': 1819} [2025-08-15 01:38:05] {'loss': '0.7005', 'loss_video': '0.2725', 'loss_audio': '0.4280', 'step': 1829, 'global_step': 1829} [2025-08-15 01:40:49] {'loss': '0.6763', 'loss_video': '0.2625', 'loss_audio': '0.4138', 'step': 1839, 'global_step': 1839} [2025-08-15 01:43:05] {'loss': '0.6558', 'loss_video': '0.2553', 'loss_audio': '0.4005', 'step': 1849, 'global_step': 1849} [2025-08-15 01:45:32] {'loss': '0.6823', 'loss_video': '0.2640', 'loss_audio': '0.4184', 'step': 1859, 'global_step': 1859} [2025-08-15 01:48:01] {'loss': '0.6723', 'loss_video': '0.2558', 'loss_audio': '0.4165', 'step': 1869, 'global_step': 1869} [2025-08-15 01:50:26] {'loss': '0.7093', 'loss_video': '0.2985', 'loss_audio': '0.4108', 'step': 1879, 'global_step': 1879} [2025-08-15 01:52:53] {'loss': '0.6665', 'loss_video': '0.2501', 'loss_audio': '0.4164', 'step': 1889, 'global_step': 1889} [2025-08-15 01:55:14] {'loss': '0.7115', 'loss_video': '0.2614', 'loss_audio': '0.4502', 'step': 1899, 'global_step': 1899} [2025-08-15 01:57:49] {'loss': '0.6503', 'loss_video': '0.2267', 'loss_audio': '0.4236', 'step': 1909, 'global_step': 1909} [2025-08-15 
02:00:14] {'loss': '0.7093', 'loss_video': '0.2702', 'loss_audio': '0.4391', 'step': 1919, 'global_step': 1919} [2025-08-15 02:02:34] {'loss': '0.7250', 'loss_video': '0.2727', 'loss_audio': '0.4523', 'step': 1929, 'global_step': 1929} [2025-08-15 02:05:26] {'loss': '0.7824', 'loss_video': '0.2744', 'loss_audio': '0.5080', 'step': 1939, 'global_step': 1939} [2025-08-15 02:07:56] {'loss': '0.7310', 'loss_video': '0.2799', 'loss_audio': '0.4511', 'step': 1949, 'global_step': 1949} [2025-08-15 02:10:47] {'loss': '0.7654', 'loss_video': '0.2729', 'loss_audio': '0.4925', 'step': 1959, 'global_step': 1959} [2025-08-15 02:13:32] {'loss': '0.7284', 'loss_video': '0.2615', 'loss_audio': '0.4669', 'step': 1969, 'global_step': 1969} [2025-08-15 02:15:56] {'loss': '0.6636', 'loss_video': '0.2507', 'loss_audio': '0.4129', 'step': 1979, 'global_step': 1979} [2025-08-15 02:18:23] {'loss': '0.6983', 'loss_video': '0.2412', 'loss_audio': '0.4571', 'step': 1989, 'global_step': 1989} [2025-08-15 02:20:42] {'loss': '0.6576', 'loss_video': '0.2589', 'loss_audio': '0.3987', 'step': 1999, 'global_step': 1999} [2025-08-15 02:20:48] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 02:21:03] Saved checkpoint at epoch 0, step 2000, global_step 2000 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2000 [2025-08-15 02:21:04] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1500 has been deleted successfully as cfg.save_total_limit! 
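The "split to checkpoint shards" messages refer to a `pytorch_model.bin.index.json` file that maps each parameter name to the shard file holding it. A minimal sketch of writing such an index in the format those messages point to; the helper name is hypothetical:

```python
import json


def write_shard_index(weight_map: dict[str, str], total_size: int,
                      path: str) -> None:
    """Write a pytorch_model.bin.index.json-style file: metadata with the
    total byte size, plus a map from parameter name to shard filename.
    Sketch of the index layout only, not the trainer's saving code."""
    index = {
        "metadata": {"total_size": total_size},
        "weight_map": weight_map,
    }
    with open(path, "w") as f:
        json.dump(index, f, indent=2)
```

Loading code can then consult this index to find which shard to open for any given parameter instead of reading all shards.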
[2025-08-15 02:23:24] {'loss': '0.6561', 'loss_video': '0.2396', 'loss_audio': '0.4165', 'step': 2009, 'global_step': 2009} [2025-08-15 02:26:09] {'loss': '0.7363', 'loss_video': '0.2746', 'loss_audio': '0.4616', 'step': 2019, 'global_step': 2019} [2025-08-15 02:28:53] {'loss': '0.6429', 'loss_video': '0.2618', 'loss_audio': '0.3811', 'step': 2029, 'global_step': 2029} [2025-08-15 02:31:26] {'loss': '0.6480', 'loss_video': '0.2479', 'loss_audio': '0.4001', 'step': 2039, 'global_step': 2039} [2025-08-15 02:33:46] {'loss': '0.7332', 'loss_video': '0.2921', 'loss_audio': '0.4411', 'step': 2049, 'global_step': 2049} [2025-08-15 02:36:15] {'loss': '0.6706', 'loss_video': '0.2716', 'loss_audio': '0.3990', 'step': 2059, 'global_step': 2059} [2025-08-15 02:38:13] {'loss': '0.7590', 'loss_video': '0.2850', 'loss_audio': '0.4740', 'step': 2069, 'global_step': 2069} [2025-08-15 02:40:36] {'loss': '0.7394', 'loss_video': '0.2902', 'loss_audio': '0.4492', 'step': 2079, 'global_step': 2079} [2025-08-15 02:43:11] {'loss': '0.6327', 'loss_video': '0.2378', 'loss_audio': '0.3949', 'step': 2089, 'global_step': 2089} [2025-08-15 02:45:52] {'loss': '0.7542', 'loss_video': '0.2846', 'loss_audio': '0.4695', 'step': 2099, 'global_step': 2099} [2025-08-15 02:48:26] {'loss': '0.7231', 'loss_video': '0.3023', 'loss_audio': '0.4209', 'step': 2109, 'global_step': 2109} [2025-08-15 02:50:57] {'loss': '0.6904', 'loss_video': '0.2667', 'loss_audio': '0.4236', 'step': 2119, 'global_step': 2119} [2025-08-15 02:53:14] {'loss': '0.6858', 'loss_video': '0.2613', 'loss_audio': '0.4246', 'step': 2129, 'global_step': 2129} [2025-08-15 02:56:05] {'loss': '0.6434', 'loss_video': '0.2570', 'loss_audio': '0.3864', 'step': 2139, 'global_step': 2139} [2025-08-15 02:58:31] {'loss': '0.6955', 'loss_video': '0.2592', 'loss_audio': '0.4363', 'step': 2149, 'global_step': 2149} [2025-08-15 03:01:12] {'loss': '0.7959', 'loss_video': '0.3050', 'loss_audio': '0.4909', 'step': 2159, 'global_step': 2159} [2025-08-15 
03:03:40] {'loss': '0.6701', 'loss_video': '0.2442', 'loss_audio': '0.4259', 'step': 2169, 'global_step': 2169} [2025-08-15 03:06:20] {'loss': '0.6875', 'loss_video': '0.2579', 'loss_audio': '0.4296', 'step': 2179, 'global_step': 2179} [2025-08-15 03:08:43] {'loss': '0.7148', 'loss_video': '0.2663', 'loss_audio': '0.4485', 'step': 2189, 'global_step': 2189} [2025-08-15 03:11:22] {'loss': '0.6581', 'loss_video': '0.2452', 'loss_audio': '0.4129', 'step': 2199, 'global_step': 2199} [2025-08-15 03:13:52] {'loss': '0.6788', 'loss_video': '0.2487', 'loss_audio': '0.4301', 'step': 2209, 'global_step': 2209} [2025-08-15 03:16:32] {'loss': '0.6739', 'loss_video': '0.2748', 'loss_audio': '0.3991', 'step': 2219, 'global_step': 2219} [2025-08-15 03:18:47] {'loss': '0.6501', 'loss_video': '0.2477', 'loss_audio': '0.4023', 'step': 2229, 'global_step': 2229} [2025-08-15 03:21:06] {'loss': '0.6985', 'loss_video': '0.2556', 'loss_audio': '0.4429', 'step': 2239, 'global_step': 2239} [2025-08-15 03:23:38] {'loss': '0.6790', 'loss_video': '0.2736', 'loss_audio': '0.4054', 'step': 2249, 'global_step': 2249} [2025-08-15 03:23:44] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 03:24:02] Saved checkpoint at epoch 0, step 2250, global_step 2250 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2250 [2025-08-15 03:24:02] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step1750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 03:26:35] {'loss': '0.7266', 'loss_video': '0.3042', 'loss_audio': '0.4224', 'step': 2259, 'global_step': 2259} [2025-08-15 03:29:12] {'loss': '0.7133', 'loss_video': '0.2913', 'loss_audio': '0.4220', 'step': 2269, 'global_step': 2269} [2025-08-15 03:31:34] {'loss': '0.6855', 'loss_video': '0.2267', 'loss_audio': '0.4588', 'step': 2279, 'global_step': 2279} [2025-08-15 03:34:01] {'loss': '0.7425', 'loss_video': '0.2750', 'loss_audio': '0.4675', 'step': 2289, 'global_step': 2289} [2025-08-15 03:36:20] {'loss': '0.6394', 'loss_video': '0.2315', 'loss_audio': '0.4079', 'step': 2299, 'global_step': 2299} [2025-08-15 03:38:51] {'loss': '0.7677', 'loss_video': '0.3220', 'loss_audio': '0.4457', 'step': 2309, 'global_step': 2309} [2025-08-15 03:41:16] {'loss': '0.6637', 'loss_video': '0.2552', 'loss_audio': '0.4086', 'step': 2319, 'global_step': 2319} [2025-08-15 03:43:46] {'loss': '0.6763', 'loss_video': '0.2430', 'loss_audio': '0.4333', 'step': 2329, 'global_step': 2329} [2025-08-15 03:46:06] {'loss': '0.7209', 'loss_video': '0.2807', 'loss_audio': '0.4402', 'step': 2339, 'global_step': 2339} [2025-08-15 03:48:32] {'loss': '0.7217', 'loss_video': '0.2530', 'loss_audio': '0.4687', 'step': 2349, 'global_step': 2349} [2025-08-15 03:51:04] {'loss': '0.7156', 'loss_video': '0.2698', 'loss_audio': '0.4458', 'step': 2359, 'global_step': 2359} [2025-08-15 03:53:42] {'loss': '0.7385', 'loss_video': '0.2579', 'loss_audio': '0.4806', 'step': 2369, 'global_step': 2369} [2025-08-15 03:55:56] {'loss': '0.6388', 'loss_video': '0.2440', 'loss_audio': '0.3948', 'step': 2379, 'global_step': 2379} [2025-08-15 03:58:43] {'loss': '0.6828', 'loss_video': '0.2514', 'loss_audio': '0.4314', 'step': 2389, 'global_step': 2389} [2025-08-15 04:01:01] {'loss': '0.6779', 'loss_video': '0.2563', 'loss_audio': '0.4216', 'step': 2399, 'global_step': 2399} [2025-08-15 04:03:20] {'loss': '0.6684', 'loss_video': '0.2537', 'loss_audio': '0.4147', 'step': 2409, 'global_step': 2409} [2025-08-15 
04:05:59] {'loss': '0.6637', 'loss_video': '0.2668', 'loss_audio': '0.3969', 'step': 2419, 'global_step': 2419} [2025-08-15 04:08:09] {'loss': '0.6501', 'loss_video': '0.2388', 'loss_audio': '0.4113', 'step': 2429, 'global_step': 2429} [2025-08-15 04:10:32] {'loss': '0.6265', 'loss_video': '0.2245', 'loss_audio': '0.4020', 'step': 2439, 'global_step': 2439} [2025-08-15 04:13:21] {'loss': '0.6423', 'loss_video': '0.2440', 'loss_audio': '0.3983', 'step': 2449, 'global_step': 2449} [2025-08-15 04:15:44] {'loss': '0.6424', 'loss_video': '0.2500', 'loss_audio': '0.3924', 'step': 2459, 'global_step': 2459} [2025-08-15 04:18:28] {'loss': '0.6158', 'loss_video': '0.2462', 'loss_audio': '0.3697', 'step': 2469, 'global_step': 2469} [2025-08-15 04:21:09] {'loss': '0.6330', 'loss_video': '0.2499', 'loss_audio': '0.3831', 'step': 2479, 'global_step': 2479} [2025-08-15 04:23:28] {'loss': '0.7625', 'loss_video': '0.2858', 'loss_audio': '0.4767', 'step': 2489, 'global_step': 2489} [2025-08-15 04:26:00] {'loss': '0.6868', 'loss_video': '0.2789', 'loss_audio': '0.4078', 'step': 2499, 'global_step': 2499} [2025-08-15 04:26:06] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 04:26:23] Saved checkpoint at epoch 0, step 2500, global_step 2500 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2500 [2025-08-15 04:26:24] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 04:28:55] {'loss': '0.6611', 'loss_video': '0.2514', 'loss_audio': '0.4097', 'step': 2509, 'global_step': 2509} [2025-08-15 04:31:26] {'loss': '0.6829', 'loss_video': '0.2512', 'loss_audio': '0.4317', 'step': 2519, 'global_step': 2519} [2025-08-15 04:33:51] {'loss': '0.7849', 'loss_video': '0.3522', 'loss_audio': '0.4327', 'step': 2529, 'global_step': 2529} [2025-08-15 04:36:03] {'loss': '0.6668', 'loss_video': '0.2333', 'loss_audio': '0.4335', 'step': 2539, 'global_step': 2539} [2025-08-15 04:38:24] {'loss': '0.7272', 'loss_video': '0.2845', 'loss_audio': '0.4426', 'step': 2549, 'global_step': 2549} [2025-08-15 04:40:37] {'loss': '0.6964', 'loss_video': '0.2478', 'loss_audio': '0.4486', 'step': 2559, 'global_step': 2559} [2025-08-15 04:42:55] {'loss': '0.7203', 'loss_video': '0.2718', 'loss_audio': '0.4485', 'step': 2569, 'global_step': 2569} [2025-08-15 04:45:37] {'loss': '0.7381', 'loss_video': '0.2928', 'loss_audio': '0.4453', 'step': 2579, 'global_step': 2579} [2025-08-15 04:48:06] {'loss': '0.6687', 'loss_video': '0.2546', 'loss_audio': '0.4141', 'step': 2589, 'global_step': 2589} [2025-08-15 04:50:36] {'loss': '0.7143', 'loss_video': '0.2813', 'loss_audio': '0.4330', 'step': 2599, 'global_step': 2599} [2025-08-15 04:52:55] {'loss': '0.7189', 'loss_video': '0.2441', 'loss_audio': '0.4748', 'step': 2609, 'global_step': 2609} [2025-08-15 04:55:30] {'loss': '0.7017', 'loss_video': '0.2865', 'loss_audio': '0.4152', 'step': 2619, 'global_step': 2619} [2025-08-15 04:57:56] {'loss': '0.7354', 'loss_video': '0.3006', 'loss_audio': '0.4347', 'step': 2629, 'global_step': 2629} [2025-08-15 05:00:42] {'loss': '0.7369', 'loss_video': '0.2998', 'loss_audio': '0.4371', 'step': 2639, 'global_step': 2639} [2025-08-15 05:03:18] {'loss': '0.7067', 'loss_video': '0.2843', 'loss_audio': '0.4224', 'step': 2649, 'global_step': 2649} [2025-08-15 05:05:40] {'loss': '0.7348', 'loss_video': '0.2623', 'loss_audio': '0.4725', 'step': 2659, 'global_step': 2659} [2025-08-15 
05:08:16] {'loss': '0.6911', 'loss_video': '0.2766', 'loss_audio': '0.4144', 'step': 2669, 'global_step': 2669} [2025-08-15 05:10:41] {'loss': '0.6456', 'loss_video': '0.2381', 'loss_audio': '0.4075', 'step': 2679, 'global_step': 2679} [2025-08-15 05:13:15] {'loss': '0.6818', 'loss_video': '0.2646', 'loss_audio': '0.4173', 'step': 2689, 'global_step': 2689} [2025-08-15 05:15:31] {'loss': '0.6899', 'loss_video': '0.2685', 'loss_audio': '0.4214', 'step': 2699, 'global_step': 2699} [2025-08-15 05:18:16] {'loss': '0.6795', 'loss_video': '0.2552', 'loss_audio': '0.4243', 'step': 2709, 'global_step': 2709} [2025-08-15 05:20:48] {'loss': '0.7311', 'loss_video': '0.2574', 'loss_audio': '0.4736', 'step': 2719, 'global_step': 2719} [2025-08-15 05:23:29] {'loss': '0.6129', 'loss_video': '0.2283', 'loss_audio': '0.3845', 'step': 2729, 'global_step': 2729} [2025-08-15 05:25:58] {'loss': '0.6570', 'loss_video': '0.2404', 'loss_audio': '0.4165', 'step': 2739, 'global_step': 2739} [2025-08-15 05:28:26] {'loss': '0.6680', 'loss_video': '0.2603', 'loss_audio': '0.4077', 'step': 2749, 'global_step': 2749} [2025-08-15 05:28:32] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 05:28:48] Saved checkpoint at epoch 0, step 2750, global_step 2750 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2750 [2025-08-15 05:28:50] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 05:31:26] {'loss': '0.7401', 'loss_video': '0.2720', 'loss_audio': '0.4681', 'step': 2759, 'global_step': 2759} [2025-08-15 05:33:44] {'loss': '0.7614', 'loss_video': '0.2920', 'loss_audio': '0.4694', 'step': 2769, 'global_step': 2769} [2025-08-15 05:36:10] {'loss': '0.7315', 'loss_video': '0.2897', 'loss_audio': '0.4417', 'step': 2779, 'global_step': 2779} [2025-08-15 05:38:43] {'loss': '0.7198', 'loss_video': '0.2706', 'loss_audio': '0.4492', 'step': 2789, 'global_step': 2789} [2025-08-15 05:41:11] {'loss': '0.8059', 'loss_video': '0.3464', 'loss_audio': '0.4595', 'step': 2799, 'global_step': 2799} [2025-08-15 05:43:48] {'loss': '0.8157', 'loss_video': '0.3190', 'loss_audio': '0.4966', 'step': 2809, 'global_step': 2809} [2025-08-15 05:46:21] {'loss': '0.6843', 'loss_video': '0.2691', 'loss_audio': '0.4153', 'step': 2819, 'global_step': 2819} [2025-08-15 05:48:58] {'loss': '0.7103', 'loss_video': '0.2792', 'loss_audio': '0.4311', 'step': 2829, 'global_step': 2829} [2025-08-15 05:51:20] {'loss': '0.6321', 'loss_video': '0.2331', 'loss_audio': '0.3990', 'step': 2839, 'global_step': 2839} [2025-08-15 05:53:49] {'loss': '0.7406', 'loss_video': '0.2970', 'loss_audio': '0.4436', 'step': 2849, 'global_step': 2849} [2025-08-15 05:56:17] {'loss': '0.6403', 'loss_video': '0.2511', 'loss_audio': '0.3892', 'step': 2859, 'global_step': 2859} [2025-08-15 05:59:07] {'loss': '0.7822', 'loss_video': '0.2763', 'loss_audio': '0.5060', 'step': 2869, 'global_step': 2869} [2025-08-15 06:01:57] {'loss': '0.7047', 'loss_video': '0.2874', 'loss_audio': '0.4173', 'step': 2879, 'global_step': 2879} [2025-08-15 06:04:28] {'loss': '0.7480', 'loss_video': '0.2802', 'loss_audio': '0.4678', 'step': 2889, 'global_step': 2889} [2025-08-15 06:07:09] {'loss': '0.7293', 'loss_video': '0.2813', 'loss_audio': '0.4480', 'step': 2899, 'global_step': 2899} [2025-08-15 06:09:36] {'loss': '0.6327', 'loss_video': '0.2595', 'loss_audio': '0.3733', 'step': 2909, 'global_step': 2909} [2025-08-15 
06:12:21] {'loss': '0.6598', 'loss_video': '0.2312', 'loss_audio': '0.4286', 'step': 2919, 'global_step': 2919} [2025-08-15 06:14:47] {'loss': '0.6674', 'loss_video': '0.2530', 'loss_audio': '0.4144', 'step': 2929, 'global_step': 2929} [2025-08-15 06:17:20] {'loss': '0.6535', 'loss_video': '0.2371', 'loss_audio': '0.4164', 'step': 2939, 'global_step': 2939} [2025-08-15 06:19:59] {'loss': '0.6769', 'loss_video': '0.2602', 'loss_audio': '0.4167', 'step': 2949, 'global_step': 2949} [2025-08-15 06:22:38] {'loss': '0.7024', 'loss_video': '0.2636', 'loss_audio': '0.4388', 'step': 2959, 'global_step': 2959} [2025-08-15 06:25:09] {'loss': '0.7669', 'loss_video': '0.3103', 'loss_audio': '0.4565', 'step': 2969, 'global_step': 2969} [2025-08-15 06:27:41] {'loss': '0.7024', 'loss_video': '0.2725', 'loss_audio': '0.4299', 'step': 2979, 'global_step': 2979} [2025-08-15 06:30:16] {'loss': '0.6771', 'loss_video': '0.2582', 'loss_audio': '0.4189', 'step': 2989, 'global_step': 2989} [2025-08-15 06:32:25] {'loss': '0.7113', 'loss_video': '0.2586', 'loss_audio': '0.4527', 'step': 2999, 'global_step': 2999} [2025-08-15 06:32:32] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 06:32:52] Saved checkpoint at epoch 0, step 3000, global_step 3000 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3000 [2025-08-15 06:32:53] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 06:35:19] {'loss': '0.6671', 'loss_video': '0.2455', 'loss_audio': '0.4217', 'step': 3009, 'global_step': 3009} [2025-08-15 06:37:54] {'loss': '0.6434', 'loss_video': '0.2667', 'loss_audio': '0.3767', 'step': 3019, 'global_step': 3019} [2025-08-15 06:40:35] {'loss': '0.6847', 'loss_video': '0.2788', 'loss_audio': '0.4060', 'step': 3029, 'global_step': 3029} [2025-08-15 06:42:56] {'loss': '0.6492', 'loss_video': '0.2395', 'loss_audio': '0.4097', 'step': 3039, 'global_step': 3039} [2025-08-15 06:45:25] {'loss': '0.7648', 'loss_video': '0.2894', 'loss_audio': '0.4754', 'step': 3049, 'global_step': 3049} [2025-08-15 06:47:48] {'loss': '0.6483', 'loss_video': '0.2413', 'loss_audio': '0.4070', 'step': 3059, 'global_step': 3059} [2025-08-15 06:50:17] {'loss': '0.7513', 'loss_video': '0.2830', 'loss_audio': '0.4683', 'step': 3069, 'global_step': 3069} [2025-08-15 06:52:15] {'loss': '0.7594', 'loss_video': '0.2604', 'loss_audio': '0.4990', 'step': 3079, 'global_step': 3079} [2025-08-15 06:54:42] {'loss': '0.7330', 'loss_video': '0.2861', 'loss_audio': '0.4469', 'step': 3089, 'global_step': 3089} [2025-08-15 06:57:12] {'loss': '0.6656', 'loss_video': '0.2573', 'loss_audio': '0.4083', 'step': 3099, 'global_step': 3099} [2025-08-15 06:59:50] {'loss': '0.7081', 'loss_video': '0.2694', 'loss_audio': '0.4387', 'step': 3109, 'global_step': 3109} [2025-08-15 07:01:57] {'loss': '0.7046', 'loss_video': '0.2775', 'loss_audio': '0.4272', 'step': 3119, 'global_step': 3119} [2025-08-15 07:04:34] {'loss': '0.7105', 'loss_video': '0.2841', 'loss_audio': '0.4264', 'step': 3129, 'global_step': 3129} [2025-08-15 07:07:05] {'loss': '0.7283', 'loss_video': '0.2796', 'loss_audio': '0.4486', 'step': 3139, 'global_step': 3139} [2025-08-15 07:09:23] {'loss': '0.7436', 'loss_video': '0.2979', 'loss_audio': '0.4457', 'step': 3149, 'global_step': 3149} [2025-08-15 07:11:53] {'loss': '0.6799', 'loss_video': '0.2487', 'loss_audio': '0.4312', 'step': 3159, 'global_step': 3159} [2025-08-15 
07:14:39] {'loss': '0.6331', 'loss_video': '0.2454', 'loss_audio': '0.3877', 'step': 3169, 'global_step': 3169} [2025-08-15 07:17:16] {'loss': '0.6619', 'loss_video': '0.2640', 'loss_audio': '0.3978', 'step': 3179, 'global_step': 3179} [2025-08-15 07:19:43] {'loss': '0.6204', 'loss_video': '0.2339', 'loss_audio': '0.3865', 'step': 3189, 'global_step': 3189} [2025-08-15 07:22:10] {'loss': '0.6513', 'loss_video': '0.2274', 'loss_audio': '0.4239', 'step': 3199, 'global_step': 3199} [2025-08-15 07:25:02] {'loss': '0.7048', 'loss_video': '0.2850', 'loss_audio': '0.4198', 'step': 3209, 'global_step': 3209} [2025-08-15 07:27:31] {'loss': '0.6833', 'loss_video': '0.2403', 'loss_audio': '0.4431', 'step': 3219, 'global_step': 3219} [2025-08-15 07:29:57] {'loss': '0.6708', 'loss_video': '0.2721', 'loss_audio': '0.3987', 'step': 3229, 'global_step': 3229} [2025-08-15 07:32:13] {'loss': '0.6452', 'loss_video': '0.2457', 'loss_audio': '0.3995', 'step': 3239, 'global_step': 3239} [2025-08-15 07:34:42] {'loss': '0.7176', 'loss_video': '0.2923', 'loss_audio': '0.4253', 'step': 3249, 'global_step': 3249} [2025-08-15 07:34:48] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 07:35:05] Saved checkpoint at epoch 0, step 3250, global_step 3250 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3250 [2025-08-15 07:35:05] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step2750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 07:37:43] {'loss': '0.7031', 'loss_video': '0.2755', 'loss_audio': '0.4277', 'step': 3259, 'global_step': 3259} [2025-08-15 07:40:01] {'loss': '0.6687', 'loss_video': '0.2702', 'loss_audio': '0.3985', 'step': 3269, 'global_step': 3269} [2025-08-15 07:42:48] {'loss': '0.6737', 'loss_video': '0.2701', 'loss_audio': '0.4036', 'step': 3279, 'global_step': 3279} [2025-08-15 07:45:15] {'loss': '0.6983', 'loss_video': '0.2772', 'loss_audio': '0.4211', 'step': 3289, 'global_step': 3289} [2025-08-15 07:48:01] {'loss': '0.6894', 'loss_video': '0.2706', 'loss_audio': '0.4188', 'step': 3299, 'global_step': 3299} [2025-08-15 07:50:05] {'loss': '0.6602', 'loss_video': '0.2425', 'loss_audio': '0.4178', 'step': 3309, 'global_step': 3309} [2025-08-15 07:52:41] {'loss': '0.6299', 'loss_video': '0.2531', 'loss_audio': '0.3768', 'step': 3319, 'global_step': 3319} [2025-08-15 07:55:13] {'loss': '0.7173', 'loss_video': '0.2751', 'loss_audio': '0.4422', 'step': 3329, 'global_step': 3329} [2025-08-15 07:57:40] {'loss': '0.7631', 'loss_video': '0.2737', 'loss_audio': '0.4894', 'step': 3339, 'global_step': 3339} [2025-08-15 08:00:02] {'loss': '0.6745', 'loss_video': '0.2598', 'loss_audio': '0.4148', 'step': 3349, 'global_step': 3349} [2025-08-15 08:02:31] {'loss': '0.7472', 'loss_video': '0.2785', 'loss_audio': '0.4686', 'step': 3359, 'global_step': 3359} [2025-08-15 08:04:57] {'loss': '0.6846', 'loss_video': '0.2744', 'loss_audio': '0.4103', 'step': 3369, 'global_step': 3369} [2025-08-15 08:07:28] {'loss': '0.7153', 'loss_video': '0.2643', 'loss_audio': '0.4509', 'step': 3379, 'global_step': 3379} [2025-08-15 08:10:07] {'loss': '0.6943', 'loss_video': '0.2583', 'loss_audio': '0.4360', 'step': 3389, 'global_step': 3389} [2025-08-15 08:12:42] {'loss': '0.6849', 'loss_video': '0.2460', 'loss_audio': '0.4389', 'step': 3399, 'global_step': 3399} [2025-08-15 08:15:05] {'loss': '0.6935', 'loss_video': '0.2474', 'loss_audio': '0.4460', 'step': 3409, 'global_step': 3409} [2025-08-15 
08:17:28] {'loss': '0.6117', 'loss_video': '0.2331', 'loss_audio': '0.3786', 'step': 3419, 'global_step': 3419} [2025-08-15 08:19:59] {'loss': '0.7126', 'loss_video': '0.2811', 'loss_audio': '0.4315', 'step': 3429, 'global_step': 3429} [2025-08-15 08:22:47] {'loss': '0.7345', 'loss_video': '0.2972', 'loss_audio': '0.4372', 'step': 3439, 'global_step': 3439} [2025-08-15 08:25:09] {'loss': '0.6288', 'loss_video': '0.2420', 'loss_audio': '0.3868', 'step': 3449, 'global_step': 3449} [2025-08-15 08:27:23] {'loss': '0.6485', 'loss_video': '0.2417', 'loss_audio': '0.4068', 'step': 3459, 'global_step': 3459} [2025-08-15 08:29:51] {'loss': '0.7958', 'loss_video': '0.2945', 'loss_audio': '0.5014', 'step': 3469, 'global_step': 3469} [2025-08-15 08:32:33] {'loss': '0.7222', 'loss_video': '0.2653', 'loss_audio': '0.4570', 'step': 3479, 'global_step': 3479} [2025-08-15 08:35:05] {'loss': '0.6706', 'loss_video': '0.2630', 'loss_audio': '0.4076', 'step': 3489, 'global_step': 3489} [2025-08-15 08:37:26] {'loss': '0.6717', 'loss_video': '0.2547', 'loss_audio': '0.4169', 'step': 3499, 'global_step': 3499} [2025-08-15 08:37:32] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 08:37:48] Saved checkpoint at epoch 0, step 3500, global_step 3500 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3500 [2025-08-15 08:37:48] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 08:40:06] {'loss': '0.7121', 'loss_video': '0.2704', 'loss_audio': '0.4418', 'step': 3509, 'global_step': 3509} [2025-08-15 08:42:25] {'loss': '0.7109', 'loss_video': '0.2723', 'loss_audio': '0.4386', 'step': 3519, 'global_step': 3519} [2025-08-15 08:44:59] {'loss': '0.7345', 'loss_video': '0.2763', 'loss_audio': '0.4581', 'step': 3529, 'global_step': 3529} [2025-08-15 08:47:36] {'loss': '0.6603', 'loss_video': '0.2250', 'loss_audio': '0.4353', 'step': 3539, 'global_step': 3539} [2025-08-15 08:50:00] {'loss': '0.7469', 'loss_video': '0.2827', 'loss_audio': '0.4642', 'step': 3549, 'global_step': 3549} [2025-08-15 08:52:42] {'loss': '0.6769', 'loss_video': '0.2688', 'loss_audio': '0.4081', 'step': 3559, 'global_step': 3559} [2025-08-15 08:55:00] {'loss': '0.6996', 'loss_video': '0.2820', 'loss_audio': '0.4177', 'step': 3569, 'global_step': 3569} [2025-08-15 08:57:34] {'loss': '0.6451', 'loss_video': '0.2367', 'loss_audio': '0.4083', 'step': 3579, 'global_step': 3579} [2025-08-15 09:00:13] {'loss': '0.7356', 'loss_video': '0.2646', 'loss_audio': '0.4710', 'step': 3589, 'global_step': 3589} [2025-08-15 09:02:38] {'loss': '0.6932', 'loss_video': '0.2768', 'loss_audio': '0.4164', 'step': 3599, 'global_step': 3599} [2025-08-15 09:05:07] {'loss': '0.6485', 'loss_video': '0.2587', 'loss_audio': '0.3897', 'step': 3609, 'global_step': 3609} [2025-08-15 09:07:37] {'loss': '0.7007', 'loss_video': '0.2776', 'loss_audio': '0.4231', 'step': 3619, 'global_step': 3619} [2025-08-15 09:10:04] {'loss': '0.7086', 'loss_video': '0.2671', 'loss_audio': '0.4415', 'step': 3629, 'global_step': 3629} [2025-08-15 09:12:13] {'loss': '0.6600', 'loss_video': '0.2426', 'loss_audio': '0.4174', 'step': 3639, 'global_step': 3639} [2025-08-15 09:14:39] {'loss': '0.6069', 'loss_video': '0.2277', 'loss_audio': '0.3792', 'step': 3649, 'global_step': 3649} [2025-08-15 09:16:59] {'loss': '0.6441', 'loss_video': '0.2385', 'loss_audio': '0.4056', 'step': 3659, 'global_step': 3659} [2025-08-15 
09:19:20] {'loss': '0.6569', 'loss_video': '0.2413', 'loss_audio': '0.4157', 'step': 3669, 'global_step': 3669} [2025-08-15 09:21:40] {'loss': '0.6343', 'loss_video': '0.2293', 'loss_audio': '0.4051', 'step': 3679, 'global_step': 3679} [2025-08-15 09:23:50] {'loss': '0.6662', 'loss_video': '0.2581', 'loss_audio': '0.4081', 'step': 3689, 'global_step': 3689} [2025-08-15 09:26:35] {'loss': '0.7339', 'loss_video': '0.2838', 'loss_audio': '0.4501', 'step': 3699, 'global_step': 3699} [2025-08-15 09:29:01] {'loss': '0.7493', 'loss_video': '0.2861', 'loss_audio': '0.4633', 'step': 3709, 'global_step': 3709} [2025-08-15 09:31:35] {'loss': '0.6984', 'loss_video': '0.2819', 'loss_audio': '0.4166', 'step': 3719, 'global_step': 3719} [2025-08-15 09:33:52] {'loss': '0.5928', 'loss_video': '0.2246', 'loss_audio': '0.3682', 'step': 3729, 'global_step': 3729} [2025-08-15 09:36:07] {'loss': '0.7326', 'loss_video': '0.2623', 'loss_audio': '0.4703', 'step': 3739, 'global_step': 3739} [2025-08-15 09:38:53] {'loss': '0.6790', 'loss_video': '0.2565', 'loss_audio': '0.4225', 'step': 3749, 'global_step': 3749} [2025-08-15 09:38:59] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 09:39:16] Saved checkpoint at epoch 0, step 3750, global_step 3750 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3750 [2025-08-15 09:39:16] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 09:41:48] {'loss': '0.6778', 'loss_video': '0.2416', 'loss_audio': '0.4363', 'step': 3759, 'global_step': 3759} [2025-08-15 09:44:08] {'loss': '0.7376', 'loss_video': '0.2907', 'loss_audio': '0.4469', 'step': 3769, 'global_step': 3769} [2025-08-15 09:46:27] {'loss': '0.7041', 'loss_video': '0.3034', 'loss_audio': '0.4007', 'step': 3779, 'global_step': 3779} [2025-08-15 09:48:42] {'loss': '0.7373', 'loss_video': '0.2583', 'loss_audio': '0.4789', 'step': 3789, 'global_step': 3789} [2025-08-15 09:50:54] {'loss': '0.6880', 'loss_video': '0.2566', 'loss_audio': '0.4314', 'step': 3799, 'global_step': 3799} [2025-08-15 09:53:23] {'loss': '0.7231', 'loss_video': '0.2959', 'loss_audio': '0.4271', 'step': 3809, 'global_step': 3809} [2025-08-15 09:55:34] {'loss': '0.6998', 'loss_video': '0.2774', 'loss_audio': '0.4223', 'step': 3819, 'global_step': 3819} [2025-08-15 09:58:07] {'loss': '0.7210', 'loss_video': '0.2796', 'loss_audio': '0.4414', 'step': 3829, 'global_step': 3829} [2025-08-15 10:00:45] {'loss': '0.6706', 'loss_video': '0.2550', 'loss_audio': '0.4155', 'step': 3839, 'global_step': 3839} [2025-08-15 10:03:21] {'loss': '0.6640', 'loss_video': '0.2457', 'loss_audio': '0.4182', 'step': 3849, 'global_step': 3849} [2025-08-15 10:05:50] {'loss': '0.6992', 'loss_video': '0.2751', 'loss_audio': '0.4241', 'step': 3859, 'global_step': 3859} [2025-08-15 10:08:22] {'loss': '0.6385', 'loss_video': '0.2451', 'loss_audio': '0.3934', 'step': 3869, 'global_step': 3869} [2025-08-15 10:10:47] {'loss': '0.6769', 'loss_video': '0.2574', 'loss_audio': '0.4195', 'step': 3879, 'global_step': 3879} [2025-08-15 10:13:42] {'loss': '0.6646', 'loss_video': '0.2466', 'loss_audio': '0.4180', 'step': 3889, 'global_step': 3889} [2025-08-15 10:16:12] {'loss': '0.6737', 'loss_video': '0.2695', 'loss_audio': '0.4042', 'step': 3899, 'global_step': 3899} [2025-08-15 10:18:33] {'loss': '0.6806', 'loss_video': '0.2842', 'loss_audio': '0.3965', 'step': 3909, 'global_step': 3909} [2025-08-15 
10:20:52] {'loss': '0.6625', 'loss_video': '0.2520', 'loss_audio': '0.4106', 'step': 3919, 'global_step': 3919} [2025-08-15 10:23:26] {'loss': '0.7008', 'loss_video': '0.2918', 'loss_audio': '0.4090', 'step': 3929, 'global_step': 3929} [2025-08-15 10:25:58] {'loss': '0.7180', 'loss_video': '0.2583', 'loss_audio': '0.4597', 'step': 3939, 'global_step': 3939} [2025-08-15 10:28:31] {'loss': '0.6626', 'loss_video': '0.2354', 'loss_audio': '0.4271', 'step': 3949, 'global_step': 3949} [2025-08-15 10:30:57] {'loss': '0.7261', 'loss_video': '0.2556', 'loss_audio': '0.4705', 'step': 3959, 'global_step': 3959} [2025-08-15 10:33:48] {'loss': '0.7020', 'loss_video': '0.2708', 'loss_audio': '0.4312', 'step': 3969, 'global_step': 3969} [2025-08-15 10:36:07] {'loss': '0.6551', 'loss_video': '0.2581', 'loss_audio': '0.3970', 'step': 3979, 'global_step': 3979} [2025-08-15 10:38:45] {'loss': '0.7154', 'loss_video': '0.2867', 'loss_audio': '0.4287', 'step': 3989, 'global_step': 3989} [2025-08-15 10:41:31] {'loss': '0.6278', 'loss_video': '0.2291', 'loss_audio': '0.3988', 'step': 3999, 'global_step': 3999} [2025-08-15 10:41:37] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 10:41:55] Saved checkpoint at epoch 0, step 4000, global_step 4000 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4000 [2025-08-15 10:41:55] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 10:44:25] {'loss': '0.6683', 'loss_video': '0.2465', 'loss_audio': '0.4217', 'step': 4009, 'global_step': 4009} [2025-08-15 10:46:56] {'loss': '0.6296', 'loss_video': '0.2534', 'loss_audio': '0.3762', 'step': 4019, 'global_step': 4019} [2025-08-15 10:49:11] {'loss': '0.6994', 'loss_video': '0.2630', 'loss_audio': '0.4363', 'step': 4029, 'global_step': 4029} [2025-08-15 10:51:37] {'loss': '0.7394', 'loss_video': '0.2715', 'loss_audio': '0.4679', 'step': 4039, 'global_step': 4039} [2025-08-15 10:53:55] {'loss': '0.6982', 'loss_video': '0.2633', 'loss_audio': '0.4349', 'step': 4049, 'global_step': 4049} [2025-08-15 10:56:15] {'loss': '0.6991', 'loss_video': '0.2683', 'loss_audio': '0.4308', 'step': 4059, 'global_step': 4059} [2025-08-15 10:58:35] {'loss': '0.6533', 'loss_video': '0.2444', 'loss_audio': '0.4089', 'step': 4069, 'global_step': 4069} [2025-08-15 11:00:57] {'loss': '0.6581', 'loss_video': '0.2697', 'loss_audio': '0.3883', 'step': 4079, 'global_step': 4079} [2025-08-15 11:03:24] {'loss': '0.7607', 'loss_video': '0.2967', 'loss_audio': '0.4640', 'step': 4089, 'global_step': 4089} [2025-08-15 11:06:00] {'loss': '0.6643', 'loss_video': '0.2656', 'loss_audio': '0.3987', 'step': 4099, 'global_step': 4099} [2025-08-15 11:08:46] {'loss': '0.7291', 'loss_video': '0.2974', 'loss_audio': '0.4317', 'step': 4109, 'global_step': 4109} [2025-08-15 11:11:29] {'loss': '0.7321', 'loss_video': '0.2958', 'loss_audio': '0.4363', 'step': 4119, 'global_step': 4119} [2025-08-15 11:13:43] {'loss': '0.6412', 'loss_video': '0.2442', 'loss_audio': '0.3970', 'step': 4129, 'global_step': 4129} [2025-08-15 11:16:11] {'loss': '0.6565', 'loss_video': '0.2601', 'loss_audio': '0.3964', 'step': 4139, 'global_step': 4139} [2025-08-15 11:18:14] {'loss': '0.7039', 'loss_video': '0.2750', 'loss_audio': '0.4289', 'step': 4149, 'global_step': 4149} [2025-08-15 11:20:51] {'loss': '0.6839', 'loss_video': '0.2520', 'loss_audio': '0.4319', 'step': 4159, 'global_step': 4159} [2025-08-15 
11:23:14] {'loss': '0.6563', 'loss_video': '0.2546', 'loss_audio': '0.4017', 'step': 4169, 'global_step': 4169} [2025-08-15 11:25:41] {'loss': '0.6625', 'loss_video': '0.2645', 'loss_audio': '0.3980', 'step': 4179, 'global_step': 4179} [2025-08-15 11:27:53] {'loss': '0.6462', 'loss_video': '0.2582', 'loss_audio': '0.3880', 'step': 4189, 'global_step': 4189} [2025-08-15 11:30:10] {'loss': '0.7311', 'loss_video': '0.2695', 'loss_audio': '0.4616', 'step': 4199, 'global_step': 4199} [2025-08-15 11:32:23] {'loss': '0.6610', 'loss_video': '0.2634', 'loss_audio': '0.3976', 'step': 4209, 'global_step': 4209} [2025-08-15 11:34:29] {'loss': '0.7051', 'loss_video': '0.2531', 'loss_audio': '0.4520', 'step': 4219, 'global_step': 4219} [2025-08-15 11:36:33] {'loss': '0.6545', 'loss_video': '0.2507', 'loss_audio': '0.4037', 'step': 4229, 'global_step': 4229} [2025-08-15 11:38:45] {'loss': '0.7202', 'loss_video': '0.2775', 'loss_audio': '0.4427', 'step': 4239, 'global_step': 4239} [2025-08-15 11:41:19] {'loss': '0.7063', 'loss_video': '0.2905', 'loss_audio': '0.4158', 'step': 4249, 'global_step': 4249} [2025-08-15 11:41:25] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 11:41:41] Saved checkpoint at epoch 0, step 4250, global_step 4250 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4250 [2025-08-15 11:41:42] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step3750 has been deleted due to cfg.save_total_limit.
[2025-08-15 11:44:34] {'loss': '0.7363', 'loss_video': '0.2884', 'loss_audio': '0.4479', 'step': 4259, 'global_step': 4259} [2025-08-15 11:46:59] {'loss': '0.6882', 'loss_video': '0.2712', 'loss_audio': '0.4170', 'step': 4269, 'global_step': 4269} [2025-08-15 11:49:14] {'loss': '0.7038', 'loss_video': '0.2778', 'loss_audio': '0.4260', 'step': 4279, 'global_step': 4279} [2025-08-15 11:51:44] {'loss': '0.7884', 'loss_video': '0.3129', 'loss_audio': '0.4754', 'step': 4289, 'global_step': 4289} [2025-08-15 11:54:32] {'loss': '0.6867', 'loss_video': '0.2606', 'loss_audio': '0.4261', 'step': 4299, 'global_step': 4299} [2025-08-15 11:57:12] {'loss': '0.7127', 'loss_video': '0.2740', 'loss_audio': '0.4387', 'step': 4309, 'global_step': 4309} [2025-08-15 11:59:30] {'loss': '0.6503', 'loss_video': '0.2498', 'loss_audio': '0.4005', 'step': 4319, 'global_step': 4319} [2025-08-15 12:02:15] {'loss': '0.6948', 'loss_video': '0.2735', 'loss_audio': '0.4213', 'step': 4329, 'global_step': 4329} [2025-08-15 12:04:38] {'loss': '0.7031', 'loss_video': '0.2841', 'loss_audio': '0.4190', 'step': 4339, 'global_step': 4339} [2025-08-15 12:07:23] {'loss': '0.7093', 'loss_video': '0.2723', 'loss_audio': '0.4370', 'step': 4349, 'global_step': 4349} [2025-08-15 12:09:46] {'loss': '0.6245', 'loss_video': '0.2117', 'loss_audio': '0.4128', 'step': 4359, 'global_step': 4359} [2025-08-15 12:12:13] {'loss': '0.7171', 'loss_video': '0.2783', 'loss_audio': '0.4388', 'step': 4369, 'global_step': 4369} [2025-08-15 12:15:00] {'loss': '0.7002', 'loss_video': '0.2608', 'loss_audio': '0.4395', 'step': 4379, 'global_step': 4379} [2025-08-15 12:17:29] {'loss': '0.6805', 'loss_video': '0.2611', 'loss_audio': '0.4194', 'step': 4389, 'global_step': 4389} [2025-08-15 12:19:45] {'loss': '0.7857', 'loss_video': '0.2895', 'loss_audio': '0.4962', 'step': 4399, 'global_step': 4399} [2025-08-15 12:22:01] {'loss': '0.7007', 'loss_video': '0.2575', 'loss_audio': '0.4432', 'step': 4409, 'global_step': 4409} [2025-08-15 
12:24:42] {'loss': '0.6932', 'loss_video': '0.2565', 'loss_audio': '0.4367', 'step': 4419, 'global_step': 4419} [2025-08-15 12:27:02] {'loss': '0.6862', 'loss_video': '0.2793', 'loss_audio': '0.4070', 'step': 4429, 'global_step': 4429} [2025-08-15 12:29:32] {'loss': '0.7617', 'loss_video': '0.2981', 'loss_audio': '0.4636', 'step': 4439, 'global_step': 4439} [2025-08-15 12:32:09] {'loss': '0.7339', 'loss_video': '0.2666', 'loss_audio': '0.4673', 'step': 4449, 'global_step': 4449} [2025-08-15 12:34:23] {'loss': '0.6774', 'loss_video': '0.2564', 'loss_audio': '0.4210', 'step': 4459, 'global_step': 4459} [2025-08-15 12:36:42] {'loss': '0.6894', 'loss_video': '0.2507', 'loss_audio': '0.4388', 'step': 4469, 'global_step': 4469} [2025-08-15 12:39:13] {'loss': '0.6682', 'loss_video': '0.2547', 'loss_audio': '0.4135', 'step': 4479, 'global_step': 4479} [2025-08-15 12:42:05] {'loss': '0.6768', 'loss_video': '0.2434', 'loss_audio': '0.4335', 'step': 4489, 'global_step': 4489} [2025-08-15 12:44:36] {'loss': '0.6703', 'loss_video': '0.2342', 'loss_audio': '0.4361', 'step': 4499, 'global_step': 4499} [2025-08-15 12:44:42] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 12:45:00] Saved checkpoint at epoch 0, step 4500, global_step 4500 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4500 [2025-08-15 12:45:00] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4000 has been deleted due to cfg.save_total_limit.
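The losses logged every 10 steps above fluctuate by ±0.05 or more between entries, so a smoothed series is easier to read than individual values. A minimal exponential-smoothing sketch, reusing the recurrence behind the weight EMA configured for this run (ema_decay=0.99); the `ema` helper is hypothetical, not part of the training code:

```python
def ema(values, decay=0.99):
    """Exponential moving average over a sequence of scalars.

    Same recurrence as weight EMA: s <- decay * s + (1 - decay) * x,
    seeded with the first value.
    """
    out, s = [], None
    for x in values:
        s = x if s is None else decay * s + (1 - decay) * x
        out.append(s)
    return out
```

Applied to the `loss` column of this log, the smoothed curve makes the slow drift visible beneath the per-step noise.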
[2025-08-15 12:47:03] {'loss': '0.6805', 'loss_video': '0.2524', 'loss_audio': '0.4281', 'step': 4509, 'global_step': 4509} [2025-08-15 12:49:30] {'loss': '0.6700', 'loss_video': '0.2680', 'loss_audio': '0.4020', 'step': 4519, 'global_step': 4519} [2025-08-15 12:52:06] {'loss': '0.6670', 'loss_video': '0.2806', 'loss_audio': '0.3863', 'step': 4529, 'global_step': 4529} [2025-08-15 12:54:37] {'loss': '0.7177', 'loss_video': '0.2652', 'loss_audio': '0.4526', 'step': 4539, 'global_step': 4539} [2025-08-15 12:56:47] {'loss': '0.7049', 'loss_video': '0.2791', 'loss_audio': '0.4258', 'step': 4549, 'global_step': 4549} [2025-08-15 12:59:25] {'loss': '0.7294', 'loss_video': '0.2410', 'loss_audio': '0.4884', 'step': 4559, 'global_step': 4559} [2025-08-15 13:02:09] {'loss': '0.7007', 'loss_video': '0.2749', 'loss_audio': '0.4258', 'step': 4569, 'global_step': 4569} [2025-08-15 13:04:37] {'loss': '0.6761', 'loss_video': '0.2568', 'loss_audio': '0.4193', 'step': 4579, 'global_step': 4579} [2025-08-15 13:07:11] {'loss': '0.6645', 'loss_video': '0.2600', 'loss_audio': '0.4044', 'step': 4589, 'global_step': 4589} [2025-08-15 13:09:34] {'loss': '0.6241', 'loss_video': '0.2287', 'loss_audio': '0.3954', 'step': 4599, 'global_step': 4599} [2025-08-15 13:12:18] {'loss': '0.7725', 'loss_video': '0.2930', 'loss_audio': '0.4796', 'step': 4609, 'global_step': 4609} [2025-08-15 13:14:34] {'loss': '0.6661', 'loss_video': '0.2563', 'loss_audio': '0.4098', 'step': 4619, 'global_step': 4619} [2025-08-15 13:17:15] {'loss': '0.6758', 'loss_video': '0.2526', 'loss_audio': '0.4232', 'step': 4629, 'global_step': 4629} [2025-08-15 13:19:40] {'loss': '0.6689', 'loss_video': '0.2496', 'loss_audio': '0.4193', 'step': 4639, 'global_step': 4639} [2025-08-15 13:22:18] {'loss': '0.6593', 'loss_video': '0.2364', 'loss_audio': '0.4229', 'step': 4649, 'global_step': 4649} [2025-08-15 13:24:52] {'loss': '0.7217', 'loss_video': '0.2895', 'loss_audio': '0.4323', 'step': 4659, 'global_step': 4659} [2025-08-15 
13:27:02] {'loss': '0.6582', 'loss_video': '0.2404', 'loss_audio': '0.4178', 'step': 4669, 'global_step': 4669} [2025-08-15 13:29:37] {'loss': '0.6769', 'loss_video': '0.2692', 'loss_audio': '0.4077', 'step': 4679, 'global_step': 4679} [2025-08-15 13:32:19] {'loss': '0.6735', 'loss_video': '0.2697', 'loss_audio': '0.4038', 'step': 4689, 'global_step': 4689} [2025-08-15 13:34:57] {'loss': '0.6504', 'loss_video': '0.2465', 'loss_audio': '0.4040', 'step': 4699, 'global_step': 4699} [2025-08-15 13:37:15] {'loss': '0.6860', 'loss_video': '0.2621', 'loss_audio': '0.4239', 'step': 4709, 'global_step': 4709} [2025-08-15 13:39:39] {'loss': '0.6623', 'loss_video': '0.2402', 'loss_audio': '0.4221', 'step': 4719, 'global_step': 4719} [2025-08-15 13:41:54] {'loss': '0.6736', 'loss_video': '0.2420', 'loss_audio': '0.4316', 'step': 4729, 'global_step': 4729} [2025-08-15 13:44:23] {'loss': '0.6114', 'loss_video': '0.2205', 'loss_audio': '0.3909', 'step': 4739, 'global_step': 4739} [2025-08-15 13:46:36] {'loss': '0.6328', 'loss_video': '0.2293', 'loss_audio': '0.4034', 'step': 4749, 'global_step': 4749} [2025-08-15 13:46:42] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 13:46:59] Saved checkpoint at epoch 0, step 4750, global_step 4750 to ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4750 [2025-08-15 13:46:59] ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4250 has been deleted due to cfg.save_total_limit.
[2025-08-15 13:49:31] {'loss': '0.6496', 'loss_video': '0.2352', 'loss_audio': '0.4144', 'step': 4759, 'global_step': 4759} [2025-08-15 13:51:56] {'loss': '0.7374', 'loss_video': '0.2909', 'loss_audio': '0.4465', 'step': 4769, 'global_step': 4769} [2025-08-15 13:54:10] {'loss': '0.6775', 'loss_video': '0.2659', 'loss_audio': '0.4116', 'step': 4779, 'global_step': 4779} [2025-08-15 13:56:53] {'loss': '0.6974', 'loss_video': '0.2839', 'loss_audio': '0.4135', 'step': 4789, 'global_step': 4789} [2025-08-15 13:59:03] {'loss': '0.6827', 'loss_video': '0.2470', 'loss_audio': '0.4356', 'step': 4799, 'global_step': 4799} [2025-08-15 14:01:34] {'loss': '0.6732', 'loss_video': '0.2506', 'loss_audio': '0.4226', 'step': 4809, 'global_step': 4809} [2025-08-15 14:04:22] {'loss': '0.6570', 'loss_video': '0.2565', 'loss_audio': '0.4005', 'step': 4819, 'global_step': 4819} [2025-08-15 14:16:29] {'loss': '0.6792', 'loss_video': '0.2570', 'loss_audio': '0.4223', 'step': 4829, 'global_step': 4829} [2025-08-15 14:23:20] Experiment directory created at ./outputs/audio_video/001-Wan2_1_T2V_1_3B [2025-08-15 14:23:20] Training configuration: {'adam_eps': 1e-15, 'aes': None, 'audio_cfg': {'augmentation': {'mixup': 0.0}, 'preprocessing': {'audio': {'duration': 10.24, 'max_wav_value': 32768.0, 'sampling_rate': 16000, 'scale_factor': 8}, 'mel': {'mel_fmax': 8000, 'mel_fmin': 0, 'n_mel_channels': 64}, 'stft': {'filter_length': 1024, 'hop_length': 160, 'win_length': 1024}}}, 'audio_vae': {'from_pretrained': './checkpoints/audioldm2', 'type': 'AudioLDM2'}, 'audio_weight_path': 'exps/audio/dual_ffn_no_attnlora/epoch017-global_step75000', 'bucket_config': {'240p': {33: ((1.0, 1.0), 16), 49: ((1.0, 0.4), 12), 65: ((1.0, 0.3), 12), 81: ((1.0, 0.2), 10)}, '360p': {33: ((0.5, 0.5), 8), 49: ((0.5, 0.3), 6), 65: ((0.5, 0.2), 6), 81: ((0.5, 0.2), 5)}, '480p': {33: ((0.5, 0.3), 5), 49: ((1.0, 0.2), 4), 65: ((1.0, 0.2), 4), 81: ((1.0, 0.1), 3)}}, 'ckpt_every': 250, 'config': 
'configs/wan2.1/train/stage2_audio_video.py', 'dataset': {'audio_transform_name': 'mel_spec_audioldm2', 'data_path': 'debug/meta/TAVGBench_train_140k.csv', 'default_video_fps': 16, 'direct_load_video_cli': True, 'scale_factor': 16, 'transform_name': 'resize_crop', 'type': 'VariableVideoAudioTextDataset', 'use_audio_in_video': True}, 'dtype': 'bf16', 'ema_decay': 0.99, 'epochs': 10, 'flow': None, 'grad_checkpoint': True, 'grad_clip': 1.0, 'load': './outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4750', 'load_text_features': False, 'log_every': 10, 'lora_alpha': 256, 'lora_dropout': 0, 'lora_enabled': True, 'lora_r': 128, 'lora_target_modules': ['self_attn.q', 'self_attn.k', 'self_attn.v', 'self_attn.o', 'cross_attn.q', 'cross_attn.k', 'cross_attn.v', 'cross_attn.o'], 'lr': 0.0001, 'mel_bins': 64, 'model': {'audio_in_dim': 8, 'audio_out_dim': 8, 'audio_patch_size': (2, 2), 'audio_special_token': False, 'class_drop_prob': 0.1, 'cross_attn_norm': True, 'dim': 1536, 'dual_ffn': True, 'ffn_dim': 8960, 'freq_dim': 256, 'init_from_video_branch': False, 'model_type': 't2av', 'num_heads': 12, 'num_layers': 30, 'patch_size': (1, 2, 2), 'qk_norm': True, 'train_audio_specific_blocks': False, 'type': 'Wan2_1_T2V_1_3B', 'weight_init_from': ['./checkpoints/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors', 'exps/audio/dual_ffn_no_attnlora/epoch017-global_step75000'], 'window_size': (-1, -1)}, 'neg_prompt': '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走,低音质,差音质,最差音质,噪音,失真的,破音,削波失真,数字瑕疵,声音故障,不自然的,刺耳的,尖锐的,底噪,过多混响,过多回声,突兀的剪辑,不自然的淡出,录音质量差,业余录音', 'num_bucket_build_workers': 8, 'num_workers': 16, 'outputs': './outputs/audio_video', 'plugin': 'zero2', 'port': 29500, 'record_time': False, 'sampling_rate': 16000, 'save_total_limit': 2, 'scheduler': {'num_sampling_steps': 50, 'transform_scale': 5.0, 'type': 'rflow', 'use_timestep_transform': True}, 'seed': 42, 'start_from_scratch': 
False, 'text_encoder': {'from_pretrained': './checkpoints/Wan2.1-T2V-1.3B', 't5_checkpoint': 'models_t5_umt5-xxl-enc-bf16.pth', 't5_tokenizer': 'google/umt5-xxl', 'text_len': 512, 'type': 'Wan2_1_T2V_1_3B_t5_umt5'}, 'vae': {'from_pretrained': './checkpoints/Wan2.1-T2V-1.3B', 'type': 'Wan2_1_T2V_1_3B_VAE', 'vae_checkpoint': 'Wan2.1_VAE.pth', 'vae_stride': (4, 8, 8)}, 'video_weight_path': './checkpoints/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors', 'wandb': False, 'warmup_steps': 1000} [2025-08-15 14:23:20] Building dataset... [2025-08-15 14:23:21] Dataset contains 140048 samples. [2025-08-15 14:23:21] Number of buckets: 204 [2025-08-15 14:23:21] Building buckets... [2025-08-15 14:23:24] Bucket Info: [2025-08-15 14:23:24] Bucket [#sample, #batch] by aspect ratio: {'0.38': [73, 6], '0.43': [269, 38], '0.48': [48, 2], '0.50': [82, 7], '0.53': [165, 21], '0.54': [578, 72], '0.56': [94859, 16895], '0.62': [844, 129], '0.67': [2354, 317], '0.75': [34023, 3483], '1.00': [303, 27], '1.33': [268, 22], '1.50': [76, 5], '1.78': [870, 90]} [2025-08-15 14:23:24] Image Bucket [#sample, #batch] by HxWxT: {} [2025-08-15 14:23:24] Video Bucket [#sample, #batch] by HxWxT: {('480p', 81): [8932, 2972], ('480p', 65): [16251, 4058], ('480p', 49): [13020, 3250], ('480p', 33): [7810, 1557], ('360p', 81): [6942, 1384], ('360p', 65): [5609, 930], ('360p', 49): [6592, 1093], ('360p', 33): [7835, 973], ('240p', 81): [12433, 1237], ('240p', 65): [14710, 1219], ('240p', 49): [13809, 1145], ('240p', 33): [20869, 1296]} [2025-08-15 14:23:24] #training batch: 20.62 K, #training sample: 131.65 K, #non empty bucket: 164 [2025-08-15 14:23:24] Building models... [2025-08-15 14:23:25] loading ./checkpoints/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth [2025-08-15 14:23:35] loading ./checkpoints/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth [2025-08-15 14:23:38] AudioLDM2 text free. [2025-08-15 14:24:07] 825/982 keys loaded from ./checkpoints/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors. 
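The bucket summary above follows from the `bucket_config` in the training configuration, which maps a resolution to `{num_frames: (probabilities, batch_size)}`. A minimal lookup sketch using the logged values; the reading of the tuple's first element as keep/sampling probabilities is an assumption (the log does not spell it out), and `bucket_batch_size` is a hypothetical helper:

```python
# bucket_config copied verbatim from the logged training configuration.
bucket_config = {
    "240p": {33: ((1.0, 1.0), 16), 49: ((1.0, 0.4), 12), 65: ((1.0, 0.3), 12), 81: ((1.0, 0.2), 10)},
    "360p": {33: ((0.5, 0.5), 8), 49: ((0.5, 0.3), 6), 65: ((0.5, 0.2), 6), 81: ((0.5, 0.2), 5)},
    "480p": {33: ((0.5, 0.3), 5), 49: ((1.0, 0.2), 4), 65: ((1.0, 0.2), 4), 81: ((1.0, 0.1), 3)},
}

def bucket_batch_size(resolution, num_frames):
    """Per-bucket batch size for a (resolution, num_frames) pair."""
    _probs, batch_size = bucket_config[resolution][num_frames]
    return batch_size
```

The logged counts are consistent with these batch sizes: for example, ('480p', 81) shows 8932 samples in 2972 batches, about 3 samples per batch, matching the configured batch size of 3 (the small shortfall versus 8932/3 is expected if each aspect-ratio sub-bucket is batched separately).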
[2025-08-15 14:24:11] Model checkpoint loaded from exps/audio/dual_ffn_no_attnlora/epoch017-global_step75000 [2025-08-15 14:24:14] Trainable model params: 90.00 M, Total model params: 2.19 B [2025-08-15 14:24:15] Preparing for distributed training... [2025-08-15 14:24:15] Boosting model for distributed training [2025-08-15 14:24:15] Training for 10 epochs with 10557 steps per epoch [2025-08-15 14:24:15] Loading checkpoint [2025-08-15 14:24:26] Loaded checkpoint ./outputs/audio_video/000-Wan2_1_T2V_1_3B/epoch000-global_step4750 at epoch 0 step 4750 [2025-08-15 14:24:26] Using neg_prompt for classifier-free guidance training: 色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走,低音质,差音质,最差音质,噪音,失真的,破音,削波失真,数字瑕疵,声音故障,不自然的,刺耳的,尖锐的,底噪,过多混响,过多回声,突兀的剪辑,不自然的淡出,录音质量差,业余录音 [English gloss: garish colors, overexposed, static, blurred details, subtitles, style, artwork, painting, frame, still, overall grayish, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, many people in background, walking backwards, low audio quality, poor audio quality, worst audio quality, noise, distorted, cracked sound, clipping distortion, digital artifacts, audio glitches, unnatural, harsh, shrill, background hiss, excessive reverb, excessive echo, abrupt cuts, unnatural fade-out, poor recording quality, amateur recording] [2025-08-15 14:24:29] Beginning epoch 0... [2025-08-15 14:27:36] {'loss': '0.6885', 'loss_video': '0.2612', 'loss_audio': '0.4273', 'step': 4759, 'global_step': 4759} [2025-08-15 14:30:01] {'loss': '0.7140', 'loss_video': '0.2787', 'loss_audio': '0.4352', 'step': 4769, 'global_step': 4769} [2025-08-15 14:32:15] {'loss': '0.7005', 'loss_video': '0.2829', 'loss_audio': '0.4176', 'step': 4779, 'global_step': 4779} [2025-08-15 14:34:58] {'loss': '0.7328', 'loss_video': '0.2718', 'loss_audio': '0.4610', 'step': 4789, 'global_step': 4789} [2025-08-15 14:37:08] {'loss': '0.7450', 'loss_video': '0.2734', 'loss_audio': '0.4716', 'step': 4799, 'global_step': 4799} [2025-08-15 14:39:39] {'loss': '0.6504', 'loss_video': '0.2517', 'loss_audio': '0.3987', 'step': 4809, 'global_step': 4809} [2025-08-15 14:42:26] {'loss': '0.7451', 'loss_video': '0.3057', 'loss_audio': '0.4393', 'step': 4819, 'global_step': 4819} [2025-08-15 14:45:16] {'loss': '0.6721', 'loss_video': '0.2647', 'loss_audio': '0.4075', 'step': 4829, 'global_step': 4829} [2025-08-15 14:47:56] {'loss': '0.6809', 'loss_video': '0.2563', 'loss_audio': '0.4246', 'step': 4839,
'global_step': 4839} [2025-08-15 14:50:05] {'loss': '0.7185', 'loss_video': '0.2792', 'loss_audio': '0.4393', 'step': 4849, 'global_step': 4849} [2025-08-15 14:52:44] {'loss': '0.6594', 'loss_video': '0.2402', 'loss_audio': '0.4192', 'step': 4859, 'global_step': 4859} [2025-08-15 14:55:04] {'loss': '0.6714', 'loss_video': '0.2550', 'loss_audio': '0.4164', 'step': 4869, 'global_step': 4869} [2025-08-15 14:57:24] {'loss': '0.6559', 'loss_video': '0.2408', 'loss_audio': '0.4151', 'step': 4879, 'global_step': 4879} [2025-08-15 15:00:00] {'loss': '0.6743', 'loss_video': '0.2526', 'loss_audio': '0.4217', 'step': 4889, 'global_step': 4889} [2025-08-15 15:02:33] {'loss': '0.7183', 'loss_video': '0.2880', 'loss_audio': '0.4303', 'step': 4899, 'global_step': 4899} [2025-08-15 15:05:08] {'loss': '0.7507', 'loss_video': '0.2706', 'loss_audio': '0.4801', 'step': 4909, 'global_step': 4909} [2025-08-15 15:07:46] {'loss': '0.7040', 'loss_video': '0.2726', 'loss_audio': '0.4314', 'step': 4919, 'global_step': 4919} [2025-08-15 15:10:14] {'loss': '0.8139', 'loss_video': '0.3268', 'loss_audio': '0.4871', 'step': 4929, 'global_step': 4929} [2025-08-15 15:12:44] {'loss': '0.7651', 'loss_video': '0.3158', 'loss_audio': '0.4493', 'step': 4939, 'global_step': 4939} [2025-08-15 15:15:14] {'loss': '0.6429', 'loss_video': '0.2200', 'loss_audio': '0.4229', 'step': 4949, 'global_step': 4949} [2025-08-15 15:17:31] {'loss': '0.7048', 'loss_video': '0.2801', 'loss_audio': '0.4247', 'step': 4959, 'global_step': 4959} [2025-08-15 15:20:00] {'loss': '0.6481', 'loss_video': '0.2619', 'loss_audio': '0.3862', 'step': 4969, 'global_step': 4969} [2025-08-15 15:22:34] {'loss': '0.7211', 'loss_video': '0.2881', 'loss_audio': '0.4330', 'step': 4979, 'global_step': 4979} [2025-08-15 15:24:52] {'loss': '0.6524', 'loss_video': '0.2490', 'loss_audio': '0.4034', 'step': 4989, 'global_step': 4989} [2025-08-15 15:27:39] {'loss': '0.7555', 'loss_video': '0.2526', 'loss_audio': '0.5029', 'step': 4999, 'global_step': 
4999} [2025-08-15 15:27:45] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 15:28:03] Saved checkpoint at epoch 0, step 5000, global_step 5000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5000 [2025-08-15 15:30:35] {'loss': '0.6778', 'loss_video': '0.2485', 'loss_audio': '0.4292', 'step': 5009, 'global_step': 5009} [2025-08-15 15:33:20] {'loss': '0.7166', 'loss_video': '0.2848', 'loss_audio': '0.4318', 'step': 5019, 'global_step': 5019} [2025-08-15 15:35:57] {'loss': '0.6647', 'loss_video': '0.2475', 'loss_audio': '0.4173', 'step': 5029, 'global_step': 5029} [2025-08-15 15:38:23] {'loss': '0.6712', 'loss_video': '0.2613', 'loss_audio': '0.4100', 'step': 5039, 'global_step': 5039} [2025-08-15 15:40:49] {'loss': '0.6500', 'loss_video': '0.2545', 'loss_audio': '0.3955', 'step': 5049, 'global_step': 5049} [2025-08-15 15:43:25] {'loss': '0.7029', 'loss_video': '0.2806', 'loss_audio': '0.4223', 'step': 5059, 'global_step': 5059} [2025-08-15 15:46:02] {'loss': '0.7228', 'loss_video': '0.2738', 'loss_audio': '0.4490', 'step': 5069, 'global_step': 5069} [2025-08-15 15:48:31] {'loss': '0.6962', 'loss_video': '0.2596', 'loss_audio': '0.4367', 'step': 5079, 'global_step': 5079} [2025-08-15 15:50:58] {'loss': '0.6777', 'loss_video': '0.2736', 'loss_audio': '0.4041', 'step': 5089, 'global_step': 5089} [2025-08-15 15:53:26] {'loss': '0.6974', 'loss_video': '0.2607', 'loss_audio': '0.4368', 'step': 5099, 'global_step': 5099} [2025-08-15 15:56:03] {'loss': '0.6245', 'loss_video': '0.2380', 'loss_audio': '0.3865', 'step': 5109, 'global_step': 5109} [2025-08-15 15:58:18] {'loss': '0.6776', 'loss_video': '0.2339', 'loss_audio': '0.4437', 'step': 5119, 'global_step': 5119} [2025-08-15 16:00:58] {'loss': '0.7023', 'loss_video': '0.2438', 'loss_audio': '0.4585', 'step': 5129, 'global_step': 5129} [2025-08-15 16:03:26] {'loss': '0.6571',
'loss_video': '0.2606', 'loss_audio': '0.3965', 'step': 5139, 'global_step': 5139} [2025-08-15 16:05:54] {'loss': '0.6407', 'loss_video': '0.2355', 'loss_audio': '0.4052', 'step': 5149, 'global_step': 5149} [2025-08-15 16:08:33] {'loss': '0.7001', 'loss_video': '0.2710', 'loss_audio': '0.4291', 'step': 5159, 'global_step': 5159} [2025-08-15 16:11:14] {'loss': '0.7026', 'loss_video': '0.2698', 'loss_audio': '0.4328', 'step': 5169, 'global_step': 5169} [2025-08-15 16:14:02] {'loss': '0.6615', 'loss_video': '0.2544', 'loss_audio': '0.4071', 'step': 5179, 'global_step': 5179} [2025-08-15 16:16:20] {'loss': '0.6735', 'loss_video': '0.2523', 'loss_audio': '0.4212', 'step': 5189, 'global_step': 5189} [2025-08-15 16:19:06] {'loss': '0.6500', 'loss_video': '0.2371', 'loss_audio': '0.4129', 'step': 5199, 'global_step': 5199} [2025-08-15 16:21:58] {'loss': '0.6582', 'loss_video': '0.2659', 'loss_audio': '0.3923', 'step': 5209, 'global_step': 5209} [2025-08-15 16:24:31] {'loss': '0.6872', 'loss_video': '0.2729', 'loss_audio': '0.4143', 'step': 5219, 'global_step': 5219} [2025-08-15 16:26:41] {'loss': '0.6831', 'loss_video': '0.2604', 'loss_audio': '0.4226', 'step': 5229, 'global_step': 5229} [2025-08-15 16:28:54] {'loss': '0.6549', 'loss_video': '0.2351', 'loss_audio': '0.4198', 'step': 5239, 'global_step': 5239} [2025-08-15 16:31:23] {'loss': '0.6661', 'loss_video': '0.2749', 'loss_audio': '0.3912', 'step': 5249, 'global_step': 5249} [2025-08-15 16:31:29] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-15 16:31:47] Saved checkpoint at epoch 0, step 5250, global_step 5250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5250 [2025-08-15 16:34:08] {'loss': '0.6946', 'loss_video': '0.2745', 'loss_audio': '0.4201', 'step': 5259, 'global_step': 5259} [2025-08-15 16:36:35] {'loss': '0.7562', 'loss_video': '0.3013', 'loss_audio': '0.4549', 'step': 5269, 'global_step': 5269} [2025-08-15 16:39:26] {'loss': '0.7060', 'loss_video': '0.2730', 'loss_audio': '0.4330', 'step': 5279, 'global_step': 5279} [2025-08-15 16:41:42] {'loss': '0.6493', 'loss_video': '0.2288', 'loss_audio': '0.4205', 'step': 5289, 'global_step': 5289} [2025-08-15 16:44:23] {'loss': '0.6676', 'loss_video': '0.2407', 'loss_audio': '0.4269', 'step': 5299, 'global_step': 5299} [2025-08-15 16:46:47] {'loss': '0.7126', 'loss_video': '0.2743', 'loss_audio': '0.4383', 'step': 5309, 'global_step': 5309} [2025-08-15 16:48:57] {'loss': '0.6965', 'loss_video': '0.2788', 'loss_audio': '0.4177', 'step': 5319, 'global_step': 5319} [2025-08-15 16:51:22] {'loss': '0.7787', 'loss_video': '0.2920', 'loss_audio': '0.4867', 'step': 5329, 'global_step': 5329} [2025-08-15 16:54:01] {'loss': '0.6669', 'loss_video': '0.2608', 'loss_audio': '0.4061', 'step': 5339, 'global_step': 5339} [2025-08-15 16:56:28] {'loss': '0.6930', 'loss_video': '0.2667', 'loss_audio': '0.4263', 'step': 5349, 'global_step': 5349} [2025-08-15 16:59:09] {'loss': '0.6806', 'loss_video': '0.2437', 'loss_audio': '0.4369', 'step': 5359, 'global_step': 5359} [2025-08-15 17:01:51] {'loss': '0.6990', 'loss_video': '0.2912', 'loss_audio': '0.4078', 'step': 5369, 'global_step': 5369} [2025-08-15 17:04:28] {'loss': '0.6233', 'loss_video': '0.2394', 'loss_audio': '0.3839', 'step': 5379, 'global_step': 5379} [2025-08-15 17:07:06] {'loss': '0.6335', 'loss_video': '0.2300', 'loss_audio': '0.4035', 'step': 5389, 'global_step': 5389} [2025-08-15 17:09:31] {'loss': '0.7282', 'loss_video': '0.2969', 'loss_audio': '0.4313', 'step': 5399, 
'global_step': 5399} [2025-08-15 17:11:51] {'loss': '0.6851', 'loss_video': '0.2487', 'loss_audio': '0.4365', 'step': 5409, 'global_step': 5409} [2025-08-15 17:14:04] {'loss': '0.6246', 'loss_video': '0.2292', 'loss_audio': '0.3954', 'step': 5419, 'global_step': 5419} [2025-08-15 17:16:17] {'loss': '0.6697', 'loss_video': '0.2830', 'loss_audio': '0.3867', 'step': 5429, 'global_step': 5429} [2025-08-15 17:19:01] {'loss': '0.6881', 'loss_video': '0.2709', 'loss_audio': '0.4172', 'step': 5439, 'global_step': 5439} [2025-08-15 17:21:37] {'loss': '0.6464', 'loss_video': '0.2445', 'loss_audio': '0.4019', 'step': 5449, 'global_step': 5449} [2025-08-15 17:23:45] {'loss': '0.6662', 'loss_video': '0.2494', 'loss_audio': '0.4168', 'step': 5459, 'global_step': 5459} [2025-08-15 17:26:15] {'loss': '0.6759', 'loss_video': '0.2487', 'loss_audio': '0.4271', 'step': 5469, 'global_step': 5469} [2025-08-15 17:28:54] {'loss': '0.6464', 'loss_video': '0.2529', 'loss_audio': '0.3935', 'step': 5479, 'global_step': 5479} [2025-08-15 17:31:10] {'loss': '0.6669', 'loss_video': '0.2489', 'loss_audio': '0.4181', 'step': 5489, 'global_step': 5489} [2025-08-15 17:33:40] {'loss': '0.6565', 'loss_video': '0.2542', 'loss_audio': '0.4023', 'step': 5499, 'global_step': 5499} [2025-08-15 17:33:46] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 17:34:03] Saved checkpoint at epoch 0, step 5500, global_step 5500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5500 [2025-08-15 17:34:04] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5000 has been deleted due to cfg.save_total_limit.
[2025-08-15 17:36:01] {'loss': '0.6928', 'loss_video': '0.2779', 'loss_audio': '0.4149', 'step': 5509, 'global_step': 5509} [2025-08-15 17:38:30] {'loss': '0.6576', 'loss_video': '0.2549', 'loss_audio': '0.4027', 'step': 5519, 'global_step': 5519} [2025-08-15 17:40:41] {'loss': '0.6733', 'loss_video': '0.2641', 'loss_audio': '0.4092', 'step': 5529, 'global_step': 5529} [2025-08-15 17:43:01] {'loss': '0.6433', 'loss_video': '0.2335', 'loss_audio': '0.4097', 'step': 5539, 'global_step': 5539} [2025-08-15 17:45:39] {'loss': '0.6493', 'loss_video': '0.2441', 'loss_audio': '0.4052', 'step': 5549, 'global_step': 5549} [2025-08-15 17:48:13] {'loss': '0.6914', 'loss_video': '0.2571', 'loss_audio': '0.4343', 'step': 5559, 'global_step': 5559} [2025-08-15 17:50:54] {'loss': '0.6786', 'loss_video': '0.2397', 'loss_audio': '0.4389', 'step': 5569, 'global_step': 5569} [2025-08-15 17:53:10] {'loss': '0.7365', 'loss_video': '0.2800', 'loss_audio': '0.4565', 'step': 5579, 'global_step': 5579} [2025-08-15 17:55:33] {'loss': '0.6730', 'loss_video': '0.2569', 'loss_audio': '0.4161', 'step': 5589, 'global_step': 5589} [2025-08-15 17:58:14] {'loss': '0.7688', 'loss_video': '0.2897', 'loss_audio': '0.4791', 'step': 5599, 'global_step': 5599} [2025-08-15 18:00:56] {'loss': '0.6632', 'loss_video': '0.2438', 'loss_audio': '0.4194', 'step': 5609, 'global_step': 5609} [2025-08-15 18:03:36] {'loss': '0.6626', 'loss_video': '0.2598', 'loss_audio': '0.4027', 'step': 5619, 'global_step': 5619} [2025-08-15 18:05:56] {'loss': '0.6486', 'loss_video': '0.2397', 'loss_audio': '0.4089', 'step': 5629, 'global_step': 5629} [2025-08-15 18:08:30] {'loss': '0.6550', 'loss_video': '0.2479', 'loss_audio': '0.4070', 'step': 5639, 'global_step': 5639} [2025-08-15 18:10:50] {'loss': '0.6724', 'loss_video': '0.2457', 'loss_audio': '0.4267', 'step': 5649, 'global_step': 5649} [2025-08-15 18:13:01] {'loss': '0.7154', 'loss_video': '0.2741', 'loss_audio': '0.4412', 'step': 5659, 'global_step': 5659} [2025-08-15 
18:15:10] {'loss': '0.6183', 'loss_video': '0.2355', 'loss_audio': '0.3828', 'step': 5669, 'global_step': 5669} [2025-08-15 18:17:36] {'loss': '0.6862', 'loss_video': '0.2517', 'loss_audio': '0.4345', 'step': 5679, 'global_step': 5679} [2025-08-15 18:20:14] {'loss': '0.7367', 'loss_video': '0.3028', 'loss_audio': '0.4338', 'step': 5689, 'global_step': 5689} [2025-08-15 18:23:01] {'loss': '0.6827', 'loss_video': '0.2490', 'loss_audio': '0.4337', 'step': 5699, 'global_step': 5699} [2025-08-15 18:25:26] {'loss': '0.6475', 'loss_video': '0.2463', 'loss_audio': '0.4012', 'step': 5709, 'global_step': 5709} [2025-08-15 18:27:52] {'loss': '0.6484', 'loss_video': '0.2480', 'loss_audio': '0.4004', 'step': 5719, 'global_step': 5719} [2025-08-15 18:30:09] {'loss': '0.6658', 'loss_video': '0.2717', 'loss_audio': '0.3941', 'step': 5729, 'global_step': 5729} [2025-08-15 18:32:32] {'loss': '0.6119', 'loss_video': '0.2415', 'loss_audio': '0.3704', 'step': 5739, 'global_step': 5739} [2025-08-15 18:35:01] {'loss': '0.6307', 'loss_video': '0.2393', 'loss_audio': '0.3914', 'step': 5749, 'global_step': 5749} [2025-08-15 18:35:08] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 18:35:25] Saved checkpoint at epoch 0, step 5750, global_step 5750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5750 [2025-08-15 18:35:25] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5250 has been deleted due to cfg.save_total_limit.
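The timestamps also give throughput: adjacent entries are logged 10 optimizer steps apart, so the wall-clock gap between them is the cost of 10 steps. A small sketch of that arithmetic, using two entries taken from this log (the `seconds_per_step` helper is hypothetical):

```python
from datetime import datetime

def seconds_per_step(t0, t1, steps):
    """Average wall-clock seconds per optimizer step between two log timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S"
    dt = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    return dt.total_seconds() / steps

# Two adjacent entries above, 10 steps apart (18:15:10 -> 18:17:36).
rate = seconds_per_step("2025-08-15 18:15:10", "2025-08-15 18:17:36", 10)
# 10557 steps per epoch, as reported when run 001 resumed.
epoch_hours = rate * 10557 / 3600
```

At roughly 14 to 15 seconds per step, the 10557 steps per epoch reported at resume correspond to about 43 hours of wall-clock time per epoch under this bucket configuration.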
[2025-08-15 18:38:13] {'loss': '0.6461', 'loss_video': '0.2438', 'loss_audio': '0.4023', 'step': 5759, 'global_step': 5759} [2025-08-15 18:40:15] {'loss': '0.6876', 'loss_video': '0.2623', 'loss_audio': '0.4253', 'step': 5769, 'global_step': 5769} [2025-08-15 18:42:42] {'loss': '0.7260', 'loss_video': '0.2705', 'loss_audio': '0.4555', 'step': 5779, 'global_step': 5779} [2025-08-15 18:45:25] {'loss': '0.6656', 'loss_video': '0.2509', 'loss_audio': '0.4147', 'step': 5789, 'global_step': 5789} [2025-08-15 18:47:49] {'loss': '0.6328', 'loss_video': '0.2453', 'loss_audio': '0.3875', 'step': 5799, 'global_step': 5799} [2025-08-15 18:50:19] {'loss': '0.7324', 'loss_video': '0.2823', 'loss_audio': '0.4501', 'step': 5809, 'global_step': 5809} [2025-08-15 18:52:57] {'loss': '0.6871', 'loss_video': '0.2889', 'loss_audio': '0.3982', 'step': 5819, 'global_step': 5819} [2025-08-15 18:55:39] {'loss': '0.7706', 'loss_video': '0.3288', 'loss_audio': '0.4417', 'step': 5829, 'global_step': 5829} [2025-08-15 18:58:08] {'loss': '0.6798', 'loss_video': '0.2641', 'loss_audio': '0.4157', 'step': 5839, 'global_step': 5839} [2025-08-15 19:00:31] {'loss': '0.6932', 'loss_video': '0.2585', 'loss_audio': '0.4347', 'step': 5849, 'global_step': 5849} [2025-08-15 19:03:00] {'loss': '0.6686', 'loss_video': '0.2665', 'loss_audio': '0.4020', 'step': 5859, 'global_step': 5859} [2025-08-15 19:05:36] {'loss': '0.7251', 'loss_video': '0.2606', 'loss_audio': '0.4646', 'step': 5869, 'global_step': 5869} [2025-08-15 19:08:19] {'loss': '0.6807', 'loss_video': '0.2470', 'loss_audio': '0.4337', 'step': 5879, 'global_step': 5879} [2025-08-15 19:11:01] {'loss': '0.6853', 'loss_video': '0.2639', 'loss_audio': '0.4214', 'step': 5889, 'global_step': 5889} [2025-08-15 19:13:31] {'loss': '0.7209', 'loss_video': '0.2775', 'loss_audio': '0.4434', 'step': 5899, 'global_step': 5899} [2025-08-15 19:15:56] {'loss': '0.7124', 'loss_video': '0.3067', 'loss_audio': '0.4058', 'step': 5909, 'global_step': 5909} [2025-08-15 
19:18:19] {'loss': '0.7095', 'loss_video': '0.2842', 'loss_audio': '0.4254', 'step': 5919, 'global_step': 5919} [2025-08-15 19:20:58] {'loss': '0.7093', 'loss_video': '0.3088', 'loss_audio': '0.4005', 'step': 5929, 'global_step': 5929} [2025-08-15 19:23:27] {'loss': '0.6804', 'loss_video': '0.2620', 'loss_audio': '0.4184', 'step': 5939, 'global_step': 5939} [2025-08-15 19:25:48] {'loss': '0.6737', 'loss_video': '0.2807', 'loss_audio': '0.3930', 'step': 5949, 'global_step': 5949} [2025-08-15 19:28:18] {'loss': '0.6743', 'loss_video': '0.2484', 'loss_audio': '0.4259', 'step': 5959, 'global_step': 5959} [2025-08-15 19:30:54] {'loss': '0.6826', 'loss_video': '0.2405', 'loss_audio': '0.4420', 'step': 5969, 'global_step': 5969} [2025-08-15 19:33:19] {'loss': '0.7433', 'loss_video': '0.2909', 'loss_audio': '0.4524', 'step': 5979, 'global_step': 5979} [2025-08-15 19:35:46] {'loss': '0.6814', 'loss_video': '0.2858', 'loss_audio': '0.3957', 'step': 5989, 'global_step': 5989} [2025-08-15 19:38:06] {'loss': '0.7378', 'loss_video': '0.2902', 'loss_audio': '0.4476', 'step': 5999, 'global_step': 5999} [2025-08-15 19:38:12] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 19:38:29] Saved checkpoint at epoch 0, step 6000, global_step 6000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6000 [2025-08-15 19:38:29] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5500 has been deleted due to cfg.save_total_limit.
[2025-08-15 19:41:07] {'loss': '0.6637', 'loss_video': '0.2329', 'loss_audio': '0.4308', 'step': 6009, 'global_step': 6009} [2025-08-15 19:43:35] {'loss': '0.6647', 'loss_video': '0.2544', 'loss_audio': '0.4103', 'step': 6019, 'global_step': 6019} [2025-08-15 19:46:06] {'loss': '0.6360', 'loss_video': '0.2517', 'loss_audio': '0.3843', 'step': 6029, 'global_step': 6029} [2025-08-15 19:48:33] {'loss': '0.6589', 'loss_video': '0.2572', 'loss_audio': '0.4017', 'step': 6039, 'global_step': 6039} [2025-08-15 19:51:10] {'loss': '0.8133', 'loss_video': '0.3210', 'loss_audio': '0.4923', 'step': 6049, 'global_step': 6049} [2025-08-15 19:53:32] {'loss': '0.6754', 'loss_video': '0.2552', 'loss_audio': '0.4201', 'step': 6059, 'global_step': 6059} [2025-08-15 19:55:32] {'loss': '0.6694', 'loss_video': '0.2567', 'loss_audio': '0.4127', 'step': 6069, 'global_step': 6069} [2025-08-15 19:58:24] {'loss': '0.7013', 'loss_video': '0.2774', 'loss_audio': '0.4239', 'step': 6079, 'global_step': 6079} [2025-08-15 20:00:54] {'loss': '0.6520', 'loss_video': '0.2459', 'loss_audio': '0.4061', 'step': 6089, 'global_step': 6089} [2025-08-15 20:03:27] {'loss': '0.6753', 'loss_video': '0.2763', 'loss_audio': '0.3990', 'step': 6099, 'global_step': 6099} [2025-08-15 20:05:57] {'loss': '0.6951', 'loss_video': '0.2813', 'loss_audio': '0.4138', 'step': 6109, 'global_step': 6109} [2025-08-15 20:08:18] {'loss': '0.6322', 'loss_video': '0.2399', 'loss_audio': '0.3924', 'step': 6119, 'global_step': 6119} [2025-08-15 20:11:03] {'loss': '0.6370', 'loss_video': '0.2353', 'loss_audio': '0.4017', 'step': 6129, 'global_step': 6129} [2025-08-15 20:13:30] {'loss': '0.6856', 'loss_video': '0.2659', 'loss_audio': '0.4197', 'step': 6139, 'global_step': 6139} [2025-08-15 20:15:30] {'loss': '0.6977', 'loss_video': '0.2456', 'loss_audio': '0.4521', 'step': 6149, 'global_step': 6149} [2025-08-15 20:18:06] {'loss': '0.6540', 'loss_video': '0.2226', 'loss_audio': '0.4314', 'step': 6159, 'global_step': 6159} [2025-08-15 
20:20:33] {'loss': '0.6792', 'loss_video': '0.2672', 'loss_audio': '0.4120', 'step': 6169, 'global_step': 6169} [2025-08-15 20:22:50] {'loss': '0.6820', 'loss_video': '0.2500', 'loss_audio': '0.4320', 'step': 6179, 'global_step': 6179} [2025-08-15 20:25:19] {'loss': '0.6599', 'loss_video': '0.2799', 'loss_audio': '0.3799', 'step': 6189, 'global_step': 6189} [2025-08-15 20:27:38] {'loss': '0.7032', 'loss_video': '0.2606', 'loss_audio': '0.4426', 'step': 6199, 'global_step': 6199} [2025-08-15 20:30:03] {'loss': '0.7349', 'loss_video': '0.2831', 'loss_audio': '0.4518', 'step': 6209, 'global_step': 6209} [2025-08-15 20:32:35] {'loss': '0.7418', 'loss_video': '0.2789', 'loss_audio': '0.4630', 'step': 6219, 'global_step': 6219} [2025-08-15 20:34:47] {'loss': '0.6474', 'loss_video': '0.2245', 'loss_audio': '0.4230', 'step': 6229, 'global_step': 6229} [2025-08-15 20:37:11] {'loss': '0.6515', 'loss_video': '0.2450', 'loss_audio': '0.4064', 'step': 6239, 'global_step': 6239} [2025-08-15 20:39:39] {'loss': '0.7104', 'loss_video': '0.2751', 'loss_audio': '0.4353', 'step': 6249, 'global_step': 6249} [2025-08-15 20:39:45] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 20:40:02] Saved checkpoint at epoch 0, step 6250, global_step 6250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6250 [2025-08-15 20:40:02] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step5750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 20:42:32] {'loss': '0.6886', 'loss_video': '0.2483', 'loss_audio': '0.4403', 'step': 6259, 'global_step': 6259} [2025-08-15 20:45:22] {'loss': '0.7102', 'loss_video': '0.2724', 'loss_audio': '0.4378', 'step': 6269, 'global_step': 6269} [2025-08-15 20:47:38] {'loss': '0.7158', 'loss_video': '0.2367', 'loss_audio': '0.4792', 'step': 6279, 'global_step': 6279} [2025-08-15 20:50:11] {'loss': '0.7489', 'loss_video': '0.2666', 'loss_audio': '0.4823', 'step': 6289, 'global_step': 6289} [2025-08-15 20:52:46] {'loss': '0.7224', 'loss_video': '0.2674', 'loss_audio': '0.4550', 'step': 6299, 'global_step': 6299} [2025-08-15 20:54:57] {'loss': '0.7211', 'loss_video': '0.2869', 'loss_audio': '0.4342', 'step': 6309, 'global_step': 6309} [2025-08-15 20:57:31] {'loss': '0.6713', 'loss_video': '0.2646', 'loss_audio': '0.4067', 'step': 6319, 'global_step': 6319} [2025-08-15 21:00:07] {'loss': '0.7433', 'loss_video': '0.3027', 'loss_audio': '0.4407', 'step': 6329, 'global_step': 6329} [2025-08-15 21:02:17] {'loss': '0.6288', 'loss_video': '0.2338', 'loss_audio': '0.3950', 'step': 6339, 'global_step': 6339} [2025-08-15 21:05:07] {'loss': '0.7236', 'loss_video': '0.3042', 'loss_audio': '0.4194', 'step': 6349, 'global_step': 6349} [2025-08-15 21:07:43] {'loss': '0.6854', 'loss_video': '0.2589', 'loss_audio': '0.4266', 'step': 6359, 'global_step': 6359} [2025-08-15 21:10:11] {'loss': '0.7456', 'loss_video': '0.3036', 'loss_audio': '0.4420', 'step': 6369, 'global_step': 6369} [2025-08-15 21:12:54] {'loss': '0.6642', 'loss_video': '0.2573', 'loss_audio': '0.4068', 'step': 6379, 'global_step': 6379} [2025-08-15 21:15:20] {'loss': '0.6331', 'loss_video': '0.2402', 'loss_audio': '0.3929', 'step': 6389, 'global_step': 6389} [2025-08-15 21:17:48] {'loss': '0.7571', 'loss_video': '0.2832', 'loss_audio': '0.4739', 'step': 6399, 'global_step': 6399} [2025-08-15 21:20:18] {'loss': '0.6443', 'loss_video': '0.2371', 'loss_audio': '0.4071', 'step': 6409, 'global_step': 6409} [2025-08-15 
21:22:58] {'loss': '0.7361', 'loss_video': '0.2923', 'loss_audio': '0.4438', 'step': 6419, 'global_step': 6419} [2025-08-15 21:25:16] {'loss': '0.6456', 'loss_video': '0.2519', 'loss_audio': '0.3937', 'step': 6429, 'global_step': 6429} [2025-08-15 21:27:51] {'loss': '0.7045', 'loss_video': '0.2676', 'loss_audio': '0.4369', 'step': 6439, 'global_step': 6439} [2025-08-15 21:30:17] {'loss': '0.6633', 'loss_video': '0.2420', 'loss_audio': '0.4213', 'step': 6449, 'global_step': 6449} [2025-08-15 21:33:04] {'loss': '0.6608', 'loss_video': '0.2541', 'loss_audio': '0.4066', 'step': 6459, 'global_step': 6459} [2025-08-15 21:35:25] {'loss': '0.6714', 'loss_video': '0.2804', 'loss_audio': '0.3910', 'step': 6469, 'global_step': 6469} [2025-08-15 21:37:49] {'loss': '0.6768', 'loss_video': '0.2660', 'loss_audio': '0.4108', 'step': 6479, 'global_step': 6479} [2025-08-15 21:40:32] {'loss': '0.6614', 'loss_video': '0.2365', 'loss_audio': '0.4249', 'step': 6489, 'global_step': 6489} [2025-08-15 21:42:54] {'loss': '0.6806', 'loss_video': '0.2739', 'loss_audio': '0.4067', 'step': 6499, 'global_step': 6499} [2025-08-15 21:43:00] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 21:43:18] Saved checkpoint at epoch 0, step 6500, global_step 6500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6500 [2025-08-15 21:43:18] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 21:45:37] {'loss': '0.7135', 'loss_video': '0.2796', 'loss_audio': '0.4340', 'step': 6509, 'global_step': 6509} [2025-08-15 21:48:16] {'loss': '0.7450', 'loss_video': '0.2872', 'loss_audio': '0.4578', 'step': 6519, 'global_step': 6519} [2025-08-15 21:50:54] {'loss': '0.6745', 'loss_video': '0.2625', 'loss_audio': '0.4120', 'step': 6529, 'global_step': 6529} [2025-08-15 21:53:21] {'loss': '0.7795', 'loss_video': '0.2962', 'loss_audio': '0.4832', 'step': 6539, 'global_step': 6539} [2025-08-15 21:55:46] {'loss': '0.6370', 'loss_video': '0.2610', 'loss_audio': '0.3761', 'step': 6549, 'global_step': 6549} [2025-08-15 21:58:21] {'loss': '0.7220', 'loss_video': '0.2745', 'loss_audio': '0.4474', 'step': 6559, 'global_step': 6559} [2025-08-15 22:00:59] {'loss': '0.7426', 'loss_video': '0.2781', 'loss_audio': '0.4646', 'step': 6569, 'global_step': 6569} [2025-08-15 22:03:31] {'loss': '0.6720', 'loss_video': '0.2673', 'loss_audio': '0.4047', 'step': 6579, 'global_step': 6579} [2025-08-15 22:05:55] {'loss': '0.6256', 'loss_video': '0.2354', 'loss_audio': '0.3902', 'step': 6589, 'global_step': 6589} [2025-08-15 22:08:18] {'loss': '0.6864', 'loss_video': '0.2668', 'loss_audio': '0.4196', 'step': 6599, 'global_step': 6599} [2025-08-15 22:10:57] {'loss': '0.6457', 'loss_video': '0.2474', 'loss_audio': '0.3983', 'step': 6609, 'global_step': 6609} [2025-08-15 22:13:45] {'loss': '0.7681', 'loss_video': '0.2849', 'loss_audio': '0.4832', 'step': 6619, 'global_step': 6619} [2025-08-15 22:16:24] {'loss': '0.6895', 'loss_video': '0.2632', 'loss_audio': '0.4263', 'step': 6629, 'global_step': 6629} [2025-08-15 22:18:56] {'loss': '0.6834', 'loss_video': '0.2689', 'loss_audio': '0.4145', 'step': 6639, 'global_step': 6639} [2025-08-15 22:21:18] {'loss': '0.5921', 'loss_video': '0.2238', 'loss_audio': '0.3683', 'step': 6649, 'global_step': 6649} [2025-08-15 22:23:46] {'loss': '0.6420', 'loss_video': '0.2587', 'loss_audio': '0.3833', 'step': 6659, 'global_step': 6659} [2025-08-15 
22:26:14] {'loss': '0.6586', 'loss_video': '0.2611', 'loss_audio': '0.3975', 'step': 6669, 'global_step': 6669} [2025-08-15 22:28:53] {'loss': '0.6713', 'loss_video': '0.2720', 'loss_audio': '0.3993', 'step': 6679, 'global_step': 6679} [2025-08-15 22:31:36] {'loss': '0.6836', 'loss_video': '0.2577', 'loss_audio': '0.4259', 'step': 6689, 'global_step': 6689} [2025-08-15 22:33:42] {'loss': '0.7280', 'loss_video': '0.2826', 'loss_audio': '0.4454', 'step': 6699, 'global_step': 6699} [2025-08-15 22:36:17] {'loss': '0.7416', 'loss_video': '0.2799', 'loss_audio': '0.4617', 'step': 6709, 'global_step': 6709} [2025-08-15 22:38:43] {'loss': '0.7302', 'loss_video': '0.2713', 'loss_audio': '0.4589', 'step': 6719, 'global_step': 6719} [2025-08-15 22:41:28] {'loss': '0.7087', 'loss_video': '0.2791', 'loss_audio': '0.4295', 'step': 6729, 'global_step': 6729} [2025-08-15 22:43:55] {'loss': '0.6556', 'loss_video': '0.2357', 'loss_audio': '0.4199', 'step': 6739, 'global_step': 6739} [2025-08-15 22:46:17] {'loss': '0.6468', 'loss_video': '0.2551', 'loss_audio': '0.3917', 'step': 6749, 'global_step': 6749} [2025-08-15 22:46:24] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 22:46:41] Saved checkpoint at epoch 0, step 6750, global_step 6750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6750 [2025-08-15 22:46:41] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 22:49:05] {'loss': '0.6957', 'loss_video': '0.2887', 'loss_audio': '0.4070', 'step': 6759, 'global_step': 6759} [2025-08-15 22:51:30] {'loss': '0.6753', 'loss_video': '0.2728', 'loss_audio': '0.4025', 'step': 6769, 'global_step': 6769} [2025-08-15 22:53:57] {'loss': '0.6294', 'loss_video': '0.2378', 'loss_audio': '0.3916', 'step': 6779, 'global_step': 6779} [2025-08-15 22:56:28] {'loss': '0.6388', 'loss_video': '0.2427', 'loss_audio': '0.3960', 'step': 6789, 'global_step': 6789} [2025-08-15 22:58:35] {'loss': '0.7360', 'loss_video': '0.2879', 'loss_audio': '0.4481', 'step': 6799, 'global_step': 6799} [2025-08-15 23:00:50] {'loss': '0.6581', 'loss_video': '0.2348', 'loss_audio': '0.4233', 'step': 6809, 'global_step': 6809} [2025-08-15 23:03:13] {'loss': '0.7193', 'loss_video': '0.2755', 'loss_audio': '0.4438', 'step': 6819, 'global_step': 6819} [2025-08-15 23:05:39] {'loss': '0.7260', 'loss_video': '0.2805', 'loss_audio': '0.4456', 'step': 6829, 'global_step': 6829} [2025-08-15 23:08:12] {'loss': '0.6513', 'loss_video': '0.2579', 'loss_audio': '0.3935', 'step': 6839, 'global_step': 6839} [2025-08-15 23:10:44] {'loss': '0.6665', 'loss_video': '0.2488', 'loss_audio': '0.4177', 'step': 6849, 'global_step': 6849} [2025-08-15 23:13:16] {'loss': '0.7205', 'loss_video': '0.2813', 'loss_audio': '0.4392', 'step': 6859, 'global_step': 6859} [2025-08-15 23:15:24] {'loss': '0.7015', 'loss_video': '0.2605', 'loss_audio': '0.4410', 'step': 6869, 'global_step': 6869} [2025-08-15 23:17:56] {'loss': '0.6729', 'loss_video': '0.2554', 'loss_audio': '0.4176', 'step': 6879, 'global_step': 6879} [2025-08-15 23:20:29] {'loss': '0.7399', 'loss_video': '0.3312', 'loss_audio': '0.4086', 'step': 6889, 'global_step': 6889} [2025-08-15 23:23:10] {'loss': '0.7209', 'loss_video': '0.2705', 'loss_audio': '0.4504', 'step': 6899, 'global_step': 6899} [2025-08-15 23:25:46] {'loss': '0.6910', 'loss_video': '0.2992', 'loss_audio': '0.3919', 'step': 6909, 'global_step': 6909} [2025-08-15 
23:28:07] {'loss': '0.6271', 'loss_video': '0.2519', 'loss_audio': '0.3752', 'step': 6919, 'global_step': 6919} [2025-08-15 23:30:42] {'loss': '0.6921', 'loss_video': '0.2741', 'loss_audio': '0.4180', 'step': 6929, 'global_step': 6929} [2025-08-15 23:33:00] {'loss': '0.7077', 'loss_video': '0.2666', 'loss_audio': '0.4411', 'step': 6939, 'global_step': 6939} [2025-08-15 23:35:20] {'loss': '0.7599', 'loss_video': '0.2896', 'loss_audio': '0.4704', 'step': 6949, 'global_step': 6949} [2025-08-15 23:37:25] {'loss': '0.6645', 'loss_video': '0.2369', 'loss_audio': '0.4276', 'step': 6959, 'global_step': 6959} [2025-08-15 23:39:40] {'loss': '0.7173', 'loss_video': '0.2554', 'loss_audio': '0.4619', 'step': 6969, 'global_step': 6969} [2025-08-15 23:42:07] {'loss': '0.7462', 'loss_video': '0.2747', 'loss_audio': '0.4715', 'step': 6979, 'global_step': 6979} [2025-08-15 23:44:36] {'loss': '0.7176', 'loss_video': '0.2696', 'loss_audio': '0.4480', 'step': 6989, 'global_step': 6989} [2025-08-15 23:47:16] {'loss': '0.7354', 'loss_video': '0.2964', 'loss_audio': '0.4389', 'step': 6999, 'global_step': 6999} [2025-08-15 23:47:22] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-15 23:47:39] Saved checkpoint at epoch 0, step 7000, global_step 7000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7000 [2025-08-15 23:47:40] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-15 23:49:45] {'loss': '0.7071', 'loss_video': '0.2607', 'loss_audio': '0.4464', 'step': 7009, 'global_step': 7009} [2025-08-15 23:51:57] {'loss': '0.6983', 'loss_video': '0.2177', 'loss_audio': '0.4806', 'step': 7019, 'global_step': 7019} [2025-08-15 23:54:24] {'loss': '0.6972', 'loss_video': '0.2405', 'loss_audio': '0.4567', 'step': 7029, 'global_step': 7029} [2025-08-15 23:57:00] {'loss': '0.7108', 'loss_video': '0.2669', 'loss_audio': '0.4439', 'step': 7039, 'global_step': 7039} [2025-08-15 23:59:24] {'loss': '0.7055', 'loss_video': '0.2711', 'loss_audio': '0.4344', 'step': 7049, 'global_step': 7049} [2025-08-16 00:01:49] {'loss': '0.7154', 'loss_video': '0.3030', 'loss_audio': '0.4124', 'step': 7059, 'global_step': 7059} [2025-08-16 00:04:11] {'loss': '0.6532', 'loss_video': '0.2552', 'loss_audio': '0.3980', 'step': 7069, 'global_step': 7069} [2025-08-16 00:06:56] {'loss': '0.7270', 'loss_video': '0.2917', 'loss_audio': '0.4353', 'step': 7079, 'global_step': 7079} [2025-08-16 00:09:39] {'loss': '0.6712', 'loss_video': '0.2508', 'loss_audio': '0.4205', 'step': 7089, 'global_step': 7089} [2025-08-16 00:11:35] {'loss': '0.6669', 'loss_video': '0.2317', 'loss_audio': '0.4352', 'step': 7099, 'global_step': 7099} [2025-08-16 00:14:13] {'loss': '0.7524', 'loss_video': '0.3098', 'loss_audio': '0.4426', 'step': 7109, 'global_step': 7109} [2025-08-16 00:16:32] {'loss': '0.6791', 'loss_video': '0.2579', 'loss_audio': '0.4211', 'step': 7119, 'global_step': 7119} [2025-08-16 00:18:36] {'loss': '0.7655', 'loss_video': '0.2828', 'loss_audio': '0.4826', 'step': 7129, 'global_step': 7129} [2025-08-16 00:21:11] {'loss': '0.6981', 'loss_video': '0.2658', 'loss_audio': '0.4323', 'step': 7139, 'global_step': 7139} [2025-08-16 00:23:55] {'loss': '0.7861', 'loss_video': '0.2992', 'loss_audio': '0.4869', 'step': 7149, 'global_step': 7149} [2025-08-16 00:26:19] {'loss': '0.6943', 'loss_video': '0.2684', 'loss_audio': '0.4259', 'step': 7159, 'global_step': 7159} [2025-08-16 
00:28:39] {'loss': '0.6621', 'loss_video': '0.2633', 'loss_audio': '0.3988', 'step': 7169, 'global_step': 7169} [2025-08-16 00:31:06] {'loss': '0.6182', 'loss_video': '0.2242', 'loss_audio': '0.3940', 'step': 7179, 'global_step': 7179} [2025-08-16 00:33:26] {'loss': '0.6492', 'loss_video': '0.2475', 'loss_audio': '0.4017', 'step': 7189, 'global_step': 7189} [2025-08-16 00:35:43] {'loss': '0.7028', 'loss_video': '0.2666', 'loss_audio': '0.4362', 'step': 7199, 'global_step': 7199} [2025-08-16 00:38:03] {'loss': '0.6955', 'loss_video': '0.2815', 'loss_audio': '0.4139', 'step': 7209, 'global_step': 7209} [2025-08-16 00:40:34] {'loss': '0.6377', 'loss_video': '0.2138', 'loss_audio': '0.4239', 'step': 7219, 'global_step': 7219} [2025-08-16 00:42:57] {'loss': '0.6606', 'loss_video': '0.2584', 'loss_audio': '0.4023', 'step': 7229, 'global_step': 7229} [2025-08-16 00:45:15] {'loss': '0.6606', 'loss_video': '0.2488', 'loss_audio': '0.4117', 'step': 7239, 'global_step': 7239} [2025-08-16 00:47:35] {'loss': '0.6179', 'loss_video': '0.2218', 'loss_audio': '0.3960', 'step': 7249, 'global_step': 7249} [2025-08-16 00:47:41] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 00:47:58] Saved checkpoint at epoch 0, step 7250, global_step 7250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7250 [2025-08-16 00:47:59] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step6750 has been deleted successfully as cfg.save_total_limit! 
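Records are emitted every 10 optimizer steps (`log_every: 10` in the configuration), so the wall-clock gap between consecutive records gives the training throughput. A rough calculation under that assumption, using two adjacent timestamps from the log above:

```python
from datetime import datetime

def seconds_per_step(ts_a: str, ts_b: str, log_every: int = 10) -> float:
    """Wall-clock seconds per optimizer step between two adjacent log records."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(ts_b, fmt) - datetime.strptime(ts_a, fmt)
    return delta.total_seconds() / log_every

if __name__ == "__main__":
    # Consecutive records above: steps 7239 -> 7249, 140 seconds apart.
    print(seconds_per_step("2025-08-16 00:45:15", "2025-08-16 00:47:35"))  # → 14.0
```

The gaps in this log hover around 120-170 seconds per 10 steps, i.e. roughly 12-17 s per step; variation is expected since the bucketed dataset mixes resolutions and clip lengths with different batch sizes.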
[2025-08-16 00:50:15] {'loss': '0.6410', 'loss_video': '0.2338', 'loss_audio': '0.4072', 'step': 7259, 'global_step': 7259} [2025-08-16 00:52:36] {'loss': '0.6621', 'loss_video': '0.2641', 'loss_audio': '0.3980', 'step': 7269, 'global_step': 7269} [2025-08-16 00:54:53] {'loss': '0.6644', 'loss_video': '0.2387', 'loss_audio': '0.4257', 'step': 7279, 'global_step': 7279} [2025-08-16 00:57:37] {'loss': '0.6395', 'loss_video': '0.2264', 'loss_audio': '0.4131', 'step': 7289, 'global_step': 7289} [2025-08-16 01:00:28] {'loss': '0.6679', 'loss_video': '0.2416', 'loss_audio': '0.4263', 'step': 7299, 'global_step': 7299} [2025-08-16 01:02:53] {'loss': '0.6712', 'loss_video': '0.2395', 'loss_audio': '0.4317', 'step': 7309, 'global_step': 7309} [2025-08-16 01:05:16] {'loss': '0.7224', 'loss_video': '0.2883', 'loss_audio': '0.4341', 'step': 7319, 'global_step': 7319} [2025-08-16 01:07:47] {'loss': '0.6890', 'loss_video': '0.2667', 'loss_audio': '0.4223', 'step': 7329, 'global_step': 7329} [2025-08-16 01:10:21] {'loss': '0.6985', 'loss_video': '0.2706', 'loss_audio': '0.4280', 'step': 7339, 'global_step': 7339} [2025-08-16 01:12:48] {'loss': '0.6542', 'loss_video': '0.2558', 'loss_audio': '0.3984', 'step': 7349, 'global_step': 7349} [2025-08-16 01:15:03] {'loss': '0.6849', 'loss_video': '0.2683', 'loss_audio': '0.4166', 'step': 7359, 'global_step': 7359} [2025-08-16 01:17:41] {'loss': '0.6981', 'loss_video': '0.2504', 'loss_audio': '0.4477', 'step': 7369, 'global_step': 7369} [2025-08-16 01:20:20] {'loss': '0.6800', 'loss_video': '0.2652', 'loss_audio': '0.4148', 'step': 7379, 'global_step': 7379} [2025-08-16 01:23:05] {'loss': '0.7396', 'loss_video': '0.2946', 'loss_audio': '0.4450', 'step': 7389, 'global_step': 7389} [2025-08-16 01:25:37] {'loss': '0.7259', 'loss_video': '0.2801', 'loss_audio': '0.4458', 'step': 7399, 'global_step': 7399} [2025-08-16 01:27:50] {'loss': '0.7250', 'loss_video': '0.2949', 'loss_audio': '0.4301', 'step': 7409, 'global_step': 7409} [2025-08-16 
01:30:37] {'loss': '0.7190', 'loss_video': '0.2449', 'loss_audio': '0.4742', 'step': 7419, 'global_step': 7419} [2025-08-16 01:33:21] {'loss': '0.6422', 'loss_video': '0.2465', 'loss_audio': '0.3957', 'step': 7429, 'global_step': 7429} [2025-08-16 01:36:06] {'loss': '0.7490', 'loss_video': '0.2538', 'loss_audio': '0.4953', 'step': 7439, 'global_step': 7439} [2025-08-16 01:38:04] {'loss': '0.7043', 'loss_video': '0.2581', 'loss_audio': '0.4462', 'step': 7449, 'global_step': 7449} [2025-08-16 01:40:38] {'loss': '0.6782', 'loss_video': '0.2593', 'loss_audio': '0.4189', 'step': 7459, 'global_step': 7459} [2025-08-16 01:43:17] {'loss': '0.6714', 'loss_video': '0.2652', 'loss_audio': '0.4062', 'step': 7469, 'global_step': 7469} [2025-08-16 01:45:42] {'loss': '0.6530', 'loss_video': '0.2599', 'loss_audio': '0.3931', 'step': 7479, 'global_step': 7479} [2025-08-16 01:48:41] {'loss': '0.6348', 'loss_video': '0.2213', 'loss_audio': '0.4136', 'step': 7489, 'global_step': 7489} [2025-08-16 01:50:47] {'loss': '0.7115', 'loss_video': '0.3096', 'loss_audio': '0.4019', 'step': 7499, 'global_step': 7499} [2025-08-16 01:50:54] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 01:51:11] Saved checkpoint at epoch 0, step 7500, global_step 7500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7500 [2025-08-16 01:51:11] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 01:53:43] {'loss': '0.6824', 'loss_video': '0.2709', 'loss_audio': '0.4115', 'step': 7509, 'global_step': 7509} [2025-08-16 01:56:01] {'loss': '0.6983', 'loss_video': '0.2666', 'loss_audio': '0.4317', 'step': 7519, 'global_step': 7519} [2025-08-16 01:58:30] {'loss': '0.7130', 'loss_video': '0.2802', 'loss_audio': '0.4328', 'step': 7529, 'global_step': 7529} [2025-08-16 02:01:15] {'loss': '0.7038', 'loss_video': '0.2887', 'loss_audio': '0.4151', 'step': 7539, 'global_step': 7539} [2025-08-16 02:03:49] {'loss': '0.6212', 'loss_video': '0.2546', 'loss_audio': '0.3666', 'step': 7549, 'global_step': 7549} [2025-08-16 02:06:25] {'loss': '0.6839', 'loss_video': '0.2480', 'loss_audio': '0.4359', 'step': 7559, 'global_step': 7559} [2025-08-16 02:09:08] {'loss': '0.7279', 'loss_video': '0.2857', 'loss_audio': '0.4422', 'step': 7569, 'global_step': 7569} [2025-08-16 02:11:35] {'loss': '0.6701', 'loss_video': '0.2376', 'loss_audio': '0.4326', 'step': 7579, 'global_step': 7579} [2025-08-16 02:14:16] {'loss': '0.6577', 'loss_video': '0.2487', 'loss_audio': '0.4089', 'step': 7589, 'global_step': 7589} [2025-08-16 02:16:53] {'loss': '0.7769', 'loss_video': '0.2826', 'loss_audio': '0.4943', 'step': 7599, 'global_step': 7599} [2025-08-16 02:19:19] {'loss': '0.6699', 'loss_video': '0.2861', 'loss_audio': '0.3838', 'step': 7609, 'global_step': 7609} [2025-08-16 02:22:04] {'loss': '0.6509', 'loss_video': '0.2655', 'loss_audio': '0.3854', 'step': 7619, 'global_step': 7619} [2025-08-16 02:24:38] {'loss': '0.6687', 'loss_video': '0.2740', 'loss_audio': '0.3948', 'step': 7629, 'global_step': 7629} [2025-08-16 02:26:57] {'loss': '0.6822', 'loss_video': '0.2620', 'loss_audio': '0.4202', 'step': 7639, 'global_step': 7639} [2025-08-16 02:29:22] {'loss': '0.6415', 'loss_video': '0.2610', 'loss_audio': '0.3805', 'step': 7649, 'global_step': 7649} [2025-08-16 02:31:43] {'loss': '0.7222', 'loss_video': '0.2716', 'loss_audio': '0.4506', 'step': 7659, 'global_step': 7659} [2025-08-16 
02:34:02] {'loss': '0.6786', 'loss_video': '0.2574', 'loss_audio': '0.4212', 'step': 7669, 'global_step': 7669} [2025-08-16 02:36:17] {'loss': '0.6013', 'loss_video': '0.2172', 'loss_audio': '0.3841', 'step': 7679, 'global_step': 7679} [2025-08-16 02:38:30] {'loss': '0.6976', 'loss_video': '0.2912', 'loss_audio': '0.4065', 'step': 7689, 'global_step': 7689} [2025-08-16 02:41:09] {'loss': '0.6679', 'loss_video': '0.2529', 'loss_audio': '0.4150', 'step': 7699, 'global_step': 7699} [2025-08-16 02:43:26] {'loss': '0.6578', 'loss_video': '0.2434', 'loss_audio': '0.4144', 'step': 7709, 'global_step': 7709} [2025-08-16 02:45:51] {'loss': '0.6162', 'loss_video': '0.2279', 'loss_audio': '0.3883', 'step': 7719, 'global_step': 7719} [2025-08-16 02:48:22] {'loss': '0.7017', 'loss_video': '0.2497', 'loss_audio': '0.4519', 'step': 7729, 'global_step': 7729} [2025-08-16 02:50:47] {'loss': '0.6624', 'loss_video': '0.2589', 'loss_audio': '0.4035', 'step': 7739, 'global_step': 7739} [2025-08-16 02:53:13] {'loss': '0.6776', 'loss_video': '0.2811', 'loss_audio': '0.3965', 'step': 7749, 'global_step': 7749} [2025-08-16 02:53:19] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 02:53:36] Saved checkpoint at epoch 0, step 7750, global_step 7750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7750 [2025-08-16 02:53:36] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 02:56:12] {'loss': '0.6448', 'loss_video': '0.2319', 'loss_audio': '0.4128', 'step': 7759, 'global_step': 7759} [2025-08-16 02:59:04] {'loss': '0.6791', 'loss_video': '0.2652', 'loss_audio': '0.4140', 'step': 7769, 'global_step': 7769} [2025-08-16 03:01:35] {'loss': '0.6714', 'loss_video': '0.2614', 'loss_audio': '0.4100', 'step': 7779, 'global_step': 7779} [2025-08-16 03:04:07] {'loss': '0.6781', 'loss_video': '0.2355', 'loss_audio': '0.4426', 'step': 7789, 'global_step': 7789} [2025-08-16 03:06:34] {'loss': '0.7568', 'loss_video': '0.3180', 'loss_audio': '0.4388', 'step': 7799, 'global_step': 7799} [2025-08-16 03:08:49] {'loss': '0.6971', 'loss_video': '0.2745', 'loss_audio': '0.4226', 'step': 7809, 'global_step': 7809} [2025-08-16 03:11:35] {'loss': '0.7036', 'loss_video': '0.2548', 'loss_audio': '0.4487', 'step': 7819, 'global_step': 7819} [2025-08-16 03:14:09] {'loss': '0.6323', 'loss_video': '0.2300', 'loss_audio': '0.4024', 'step': 7829, 'global_step': 7829} [2025-08-16 03:16:25] {'loss': '0.6974', 'loss_video': '0.2769', 'loss_audio': '0.4205', 'step': 7839, 'global_step': 7839} [2025-08-16 03:19:04] {'loss': '0.7425', 'loss_video': '0.2907', 'loss_audio': '0.4518', 'step': 7849, 'global_step': 7849} [2025-08-16 03:21:18] {'loss': '0.6596', 'loss_video': '0.2369', 'loss_audio': '0.4226', 'step': 7859, 'global_step': 7859} [2025-08-16 03:24:03] {'loss': '0.6091', 'loss_video': '0.2349', 'loss_audio': '0.3742', 'step': 7869, 'global_step': 7869} [2025-08-16 03:26:24] {'loss': '0.7203', 'loss_video': '0.2768', 'loss_audio': '0.4435', 'step': 7879, 'global_step': 7879} [2025-08-16 03:28:28] {'loss': '0.7056', 'loss_video': '0.2774', 'loss_audio': '0.4283', 'step': 7889, 'global_step': 7889} [2025-08-16 03:31:04] {'loss': '0.6276', 'loss_video': '0.2336', 'loss_audio': '0.3940', 'step': 7899, 'global_step': 7899} [2025-08-16 03:33:30] {'loss': '0.6844', 'loss_video': '0.2578', 'loss_audio': '0.4266', 'step': 7909, 'global_step': 7909} [2025-08-16 
03:35:52] {'loss': '0.7447', 'loss_video': '0.3022', 'loss_audio': '0.4426', 'step': 7919, 'global_step': 7919} [2025-08-16 03:38:44] {'loss': '0.7243', 'loss_video': '0.2878', 'loss_audio': '0.4365', 'step': 7929, 'global_step': 7929} [2025-08-16 03:41:17] {'loss': '0.6912', 'loss_video': '0.2617', 'loss_audio': '0.4295', 'step': 7939, 'global_step': 7939} [2025-08-16 03:43:40] {'loss': '0.6800', 'loss_video': '0.2492', 'loss_audio': '0.4307', 'step': 7949, 'global_step': 7949} [2025-08-16 03:46:19] {'loss': '0.6009', 'loss_video': '0.2316', 'loss_audio': '0.3693', 'step': 7959, 'global_step': 7959} [2025-08-16 03:48:54] {'loss': '0.6706', 'loss_video': '0.2560', 'loss_audio': '0.4145', 'step': 7969, 'global_step': 7969} [2025-08-16 03:51:20] {'loss': '0.6861', 'loss_video': '0.2717', 'loss_audio': '0.4143', 'step': 7979, 'global_step': 7979} [2025-08-16 03:53:41] {'loss': '0.7269', 'loss_video': '0.2943', 'loss_audio': '0.4326', 'step': 7989, 'global_step': 7989} [2025-08-16 03:55:57] {'loss': '0.6858', 'loss_video': '0.2571', 'loss_audio': '0.4287', 'step': 7999, 'global_step': 7999} [2025-08-16 03:56:03] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 03:56:22] Saved checkpoint at epoch 0, step 8000, global_step 8000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8000 [2025-08-16 03:56:23] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 03:58:38] {'loss': '0.6505', 'loss_video': '0.2407', 'loss_audio': '0.4098', 'step': 8009, 'global_step': 8009} [2025-08-16 04:00:49] {'loss': '0.7346', 'loss_video': '0.2879', 'loss_audio': '0.4468', 'step': 8019, 'global_step': 8019} [2025-08-16 04:03:00] {'loss': '0.6845', 'loss_video': '0.2482', 'loss_audio': '0.4363', 'step': 8029, 'global_step': 8029} [2025-08-16 04:05:34] {'loss': '0.6270', 'loss_video': '0.2400', 'loss_audio': '0.3870', 'step': 8039, 'global_step': 8039} [2025-08-16 04:07:58] {'loss': '0.6657', 'loss_video': '0.2810', 'loss_audio': '0.3847', 'step': 8049, 'global_step': 8049} [2025-08-16 04:10:35] {'loss': '0.6623', 'loss_video': '0.2505', 'loss_audio': '0.4118', 'step': 8059, 'global_step': 8059} [2025-08-16 04:13:01] {'loss': '0.6639', 'loss_video': '0.2749', 'loss_audio': '0.3890', 'step': 8069, 'global_step': 8069} [2025-08-16 04:15:40] {'loss': '0.6884', 'loss_video': '0.2818', 'loss_audio': '0.4066', 'step': 8079, 'global_step': 8079} [2025-08-16 04:18:03] {'loss': '0.6859', 'loss_video': '0.2627', 'loss_audio': '0.4232', 'step': 8089, 'global_step': 8089} [2025-08-16 04:20:26] {'loss': '0.6148', 'loss_video': '0.2228', 'loss_audio': '0.3920', 'step': 8099, 'global_step': 8099} [2025-08-16 04:23:02] {'loss': '0.6723', 'loss_video': '0.2491', 'loss_audio': '0.4232', 'step': 8109, 'global_step': 8109} [2025-08-16 04:25:31] {'loss': '0.6918', 'loss_video': '0.2694', 'loss_audio': '0.4224', 'step': 8119, 'global_step': 8119} [2025-08-16 04:28:00] {'loss': '0.6383', 'loss_video': '0.2561', 'loss_audio': '0.3823', 'step': 8129, 'global_step': 8129} [2025-08-16 04:30:38] {'loss': '0.6544', 'loss_video': '0.2373', 'loss_audio': '0.4171', 'step': 8139, 'global_step': 8139} [2025-08-16 04:32:58] {'loss': '0.6420', 'loss_video': '0.2284', 'loss_audio': '0.4136', 'step': 8149, 'global_step': 8149} [2025-08-16 04:35:36] {'loss': '0.6328', 'loss_video': '0.2374', 'loss_audio': '0.3955', 'step': 8159, 'global_step': 8159} [2025-08-16 
04:38:06] {'loss': '0.6988', 'loss_video': '0.2810', 'loss_audio': '0.4179', 'step': 8169, 'global_step': 8169} [2025-08-16 04:40:28] {'loss': '0.6282', 'loss_video': '0.2250', 'loss_audio': '0.4032', 'step': 8179, 'global_step': 8179} [2025-08-16 04:42:47] {'loss': '0.6735', 'loss_video': '0.2731', 'loss_audio': '0.4004', 'step': 8189, 'global_step': 8189} [2025-08-16 04:45:10] {'loss': '0.6282', 'loss_video': '0.2410', 'loss_audio': '0.3872', 'step': 8199, 'global_step': 8199} [2025-08-16 04:47:32] {'loss': '0.7198', 'loss_video': '0.2957', 'loss_audio': '0.4242', 'step': 8209, 'global_step': 8209} [2025-08-16 04:49:57] {'loss': '0.6870', 'loss_video': '0.2821', 'loss_audio': '0.4049', 'step': 8219, 'global_step': 8219} [2025-08-16 04:52:44] {'loss': '0.6921', 'loss_video': '0.2697', 'loss_audio': '0.4224', 'step': 8229, 'global_step': 8229} [2025-08-16 04:55:17] {'loss': '0.6690', 'loss_video': '0.2593', 'loss_audio': '0.4097', 'step': 8239, 'global_step': 8239} [2025-08-16 04:57:50] {'loss': '0.7706', 'loss_video': '0.3003', 'loss_audio': '0.4703', 'step': 8249, 'global_step': 8249} [2025-08-16 04:57:56] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 04:58:14] Saved checkpoint at epoch 0, step 8250, global_step 8250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8250 [2025-08-16 04:58:14] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step7750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 05:00:53] {'loss': '0.6362', 'loss_video': '0.2445', 'loss_audio': '0.3917', 'step': 8259, 'global_step': 8259}
[2025-08-16 05:03:36] {'loss': '0.6462', 'loss_video': '0.2434', 'loss_audio': '0.4028', 'step': 8269, 'global_step': 8269}
[2025-08-16 05:05:53] {'loss': '0.6637', 'loss_video': '0.2636', 'loss_audio': '0.4001', 'step': 8279, 'global_step': 8279}
[2025-08-16 05:08:04] {'loss': '0.7001', 'loss_video': '0.2650', 'loss_audio': '0.4350', 'step': 8289, 'global_step': 8289}
[2025-08-16 05:10:22] {'loss': '0.7144', 'loss_video': '0.2632', 'loss_audio': '0.4513', 'step': 8299, 'global_step': 8299}
[2025-08-16 05:12:46] {'loss': '0.6692', 'loss_video': '0.2693', 'loss_audio': '0.3998', 'step': 8309, 'global_step': 8309}
[2025-08-16 05:14:58] {'loss': '0.7013', 'loss_video': '0.2520', 'loss_audio': '0.4493', 'step': 8319, 'global_step': 8319}
[2025-08-16 05:17:34] {'loss': '0.6654', 'loss_video': '0.2700', 'loss_audio': '0.3954', 'step': 8329, 'global_step': 8329}
[2025-08-16 05:20:13] {'loss': '0.6972', 'loss_video': '0.2837', 'loss_audio': '0.4135', 'step': 8339, 'global_step': 8339}
[2025-08-16 05:22:46] {'loss': '0.6999', 'loss_video': '0.2741', 'loss_audio': '0.4258', 'step': 8349, 'global_step': 8349}
[2025-08-16 05:25:35] {'loss': '0.7163', 'loss_video': '0.3061', 'loss_audio': '0.4103', 'step': 8359, 'global_step': 8359}
[2025-08-16 05:28:20] {'loss': '0.6927', 'loss_video': '0.2783', 'loss_audio': '0.4144', 'step': 8369, 'global_step': 8369}
[2025-08-16 05:31:06] {'loss': '0.7189', 'loss_video': '0.3177', 'loss_audio': '0.4012', 'step': 8379, 'global_step': 8379}
[2025-08-16 05:33:29] {'loss': '0.6719', 'loss_video': '0.2603', 'loss_audio': '0.4116', 'step': 8389, 'global_step': 8389}
[2025-08-16 05:35:53] {'loss': '0.7218', 'loss_video': '0.2853', 'loss_audio': '0.4364', 'step': 8399, 'global_step': 8399}
[2025-08-16 05:38:33] {'loss': '0.6362', 'loss_video': '0.2467', 'loss_audio': '0.3895', 'step': 8409, 'global_step': 8409}
[2025-08-16 05:41:03] {'loss': '0.6390', 'loss_video': '0.2417', 'loss_audio': '0.3972', 'step': 8419, 'global_step': 8419}
[2025-08-16 05:43:18] {'loss': '0.6766', 'loss_video': '0.2741', 'loss_audio': '0.4025', 'step': 8429, 'global_step': 8429}
[2025-08-16 05:45:59] {'loss': '0.6889', 'loss_video': '0.2663', 'loss_audio': '0.4225', 'step': 8439, 'global_step': 8439}
[2025-08-16 05:48:30] {'loss': '0.6779', 'loss_video': '0.2655', 'loss_audio': '0.4124', 'step': 8449, 'global_step': 8449}
[2025-08-16 05:51:10] {'loss': '0.6983', 'loss_video': '0.2768', 'loss_audio': '0.4215', 'step': 8459, 'global_step': 8459}
[2025-08-16 05:53:37] {'loss': '0.7335', 'loss_video': '0.2675', 'loss_audio': '0.4660', 'step': 8469, 'global_step': 8469}
[2025-08-16 05:56:15] {'loss': '0.7360', 'loss_video': '0.3174', 'loss_audio': '0.4186', 'step': 8479, 'global_step': 8479}
[2025-08-16 05:58:50] {'loss': '0.7221', 'loss_video': '0.2980', 'loss_audio': '0.4241', 'step': 8489, 'global_step': 8489}
[2025-08-16 06:01:05] {'loss': '0.6637', 'loss_video': '0.2557', 'loss_audio': '0.4080', 'step': 8499, 'global_step': 8499}
[2025-08-16 06:01:11] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 06:01:28] Saved checkpoint at epoch 0, step 8500, global_step 8500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8500
[2025-08-16 06:01:28] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8000 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 06:03:40] {'loss': '0.7044', 'loss_video': '0.2672', 'loss_audio': '0.4372', 'step': 8509, 'global_step': 8509}
[2025-08-16 06:06:05] {'loss': '0.7144', 'loss_video': '0.2572', 'loss_audio': '0.4572', 'step': 8519, 'global_step': 8519}
[2025-08-16 06:08:29] {'loss': '0.6436', 'loss_video': '0.2337', 'loss_audio': '0.4100', 'step': 8529, 'global_step': 8529}
[2025-08-16 06:11:01] {'loss': '0.6823', 'loss_video': '0.2491', 'loss_audio': '0.4332', 'step': 8539, 'global_step': 8539}
[2025-08-16 06:13:37] {'loss': '0.7120', 'loss_video': '0.2747', 'loss_audio': '0.4373', 'step': 8549, 'global_step': 8549}
[2025-08-16 06:16:03] {'loss': '0.6910', 'loss_video': '0.2590', 'loss_audio': '0.4320', 'step': 8559, 'global_step': 8559}
[2025-08-16 06:18:16] {'loss': '0.7175', 'loss_video': '0.2949', 'loss_audio': '0.4225', 'step': 8569, 'global_step': 8569}
[2025-08-16 06:20:53] {'loss': '0.6329', 'loss_video': '0.2386', 'loss_audio': '0.3942', 'step': 8579, 'global_step': 8579}
[2025-08-16 06:23:17] {'loss': '0.6555', 'loss_video': '0.2454', 'loss_audio': '0.4101', 'step': 8589, 'global_step': 8589}
[2025-08-16 06:25:37] {'loss': '0.6358', 'loss_video': '0.2464', 'loss_audio': '0.3894', 'step': 8599, 'global_step': 8599}
[2025-08-16 06:28:07] {'loss': '0.6441', 'loss_video': '0.2432', 'loss_audio': '0.4009', 'step': 8609, 'global_step': 8609}
[2025-08-16 06:30:49] {'loss': '0.6754', 'loss_video': '0.2240', 'loss_audio': '0.4514', 'step': 8619, 'global_step': 8619}
[2025-08-16 06:33:25] {'loss': '0.7220', 'loss_video': '0.2788', 'loss_audio': '0.4432', 'step': 8629, 'global_step': 8629}
[2025-08-16 06:35:44] {'loss': '0.6478', 'loss_video': '0.2349', 'loss_audio': '0.4128', 'step': 8639, 'global_step': 8639}
[2025-08-16 06:37:57] {'loss': '0.6506', 'loss_video': '0.2218', 'loss_audio': '0.4289', 'step': 8649, 'global_step': 8649}
[2025-08-16 06:40:18] {'loss': '0.7244', 'loss_video': '0.2647', 'loss_audio': '0.4597', 'step': 8659, 'global_step': 8659}
[2025-08-16 06:42:57] {'loss': '0.7343', 'loss_video': '0.3118', 'loss_audio': '0.4225', 'step': 8669, 'global_step': 8669}
[2025-08-16 06:45:17] {'loss': '0.6637', 'loss_video': '0.2550', 'loss_audio': '0.4087', 'step': 8679, 'global_step': 8679}
[2025-08-16 06:47:42] {'loss': '0.6674', 'loss_video': '0.2497', 'loss_audio': '0.4177', 'step': 8689, 'global_step': 8689}
[2025-08-16 06:49:52] {'loss': '0.6852', 'loss_video': '0.2609', 'loss_audio': '0.4243', 'step': 8699, 'global_step': 8699}
[2025-08-16 06:52:02] {'loss': '0.6287', 'loss_video': '0.2472', 'loss_audio': '0.3815', 'step': 8709, 'global_step': 8709}
[2025-08-16 06:54:16] {'loss': '0.7027', 'loss_video': '0.2824', 'loss_audio': '0.4202', 'step': 8719, 'global_step': 8719}
[2025-08-16 06:56:29] {'loss': '0.6246', 'loss_video': '0.2520', 'loss_audio': '0.3726', 'step': 8729, 'global_step': 8729}
[2025-08-16 06:58:51] {'loss': '0.6593', 'loss_video': '0.2549', 'loss_audio': '0.4045', 'step': 8739, 'global_step': 8739}
[2025-08-16 07:00:46] {'loss': '0.6769', 'loss_video': '0.2747', 'loss_audio': '0.4022', 'step': 8749, 'global_step': 8749}
[2025-08-16 07:00:53] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 07:01:10] Saved checkpoint at epoch 0, step 8750, global_step 8750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8750
[2025-08-16 07:01:10] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8250 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 07:03:27] {'loss': '0.6538', 'loss_video': '0.2623', 'loss_audio': '0.3915', 'step': 8759, 'global_step': 8759}
[2025-08-16 07:06:08] {'loss': '0.6779', 'loss_video': '0.2325', 'loss_audio': '0.4454', 'step': 8769, 'global_step': 8769}
[2025-08-16 07:08:14] {'loss': '0.7317', 'loss_video': '0.2601', 'loss_audio': '0.4717', 'step': 8779, 'global_step': 8779}
[2025-08-16 07:10:35] {'loss': '0.7410', 'loss_video': '0.2721', 'loss_audio': '0.4690', 'step': 8789, 'global_step': 8789}
[2025-08-16 07:13:03] {'loss': '0.6570', 'loss_video': '0.2475', 'loss_audio': '0.4095', 'step': 8799, 'global_step': 8799}
[2025-08-16 07:15:26] {'loss': '0.6724', 'loss_video': '0.2550', 'loss_audio': '0.4175', 'step': 8809, 'global_step': 8809}
[2025-08-16 07:17:54] {'loss': '0.6788', 'loss_video': '0.2668', 'loss_audio': '0.4121', 'step': 8819, 'global_step': 8819}
[2025-08-16 07:19:59] {'loss': '0.7442', 'loss_video': '0.2724', 'loss_audio': '0.4718', 'step': 8829, 'global_step': 8829}
[2025-08-16 07:22:29] {'loss': '0.7114', 'loss_video': '0.2933', 'loss_audio': '0.4181', 'step': 8839, 'global_step': 8839}
[2025-08-16 07:25:15] {'loss': '0.6944', 'loss_video': '0.2615', 'loss_audio': '0.4329', 'step': 8849, 'global_step': 8849}
[2025-08-16 07:27:39] {'loss': '0.7957', 'loss_video': '0.2920', 'loss_audio': '0.5037', 'step': 8859, 'global_step': 8859}
[2025-08-16 07:30:36] {'loss': '0.7012', 'loss_video': '0.2845', 'loss_audio': '0.4167', 'step': 8869, 'global_step': 8869}
[2025-08-16 07:32:44] {'loss': '0.6781', 'loss_video': '0.2571', 'loss_audio': '0.4209', 'step': 8879, 'global_step': 8879}
[2025-08-16 07:35:10] {'loss': '0.6324', 'loss_video': '0.2303', 'loss_audio': '0.4020', 'step': 8889, 'global_step': 8889}
[2025-08-16 07:37:41] {'loss': '0.6556', 'loss_video': '0.2403', 'loss_audio': '0.4153', 'step': 8899, 'global_step': 8899}
[2025-08-16 07:40:12] {'loss': '0.6915', 'loss_video': '0.2652', 'loss_audio': '0.4263', 'step': 8909, 'global_step': 8909}
[2025-08-16 07:42:28] {'loss': '0.6713', 'loss_video': '0.2699', 'loss_audio': '0.4014', 'step': 8919, 'global_step': 8919}
[2025-08-16 07:44:47] {'loss': '0.6833', 'loss_video': '0.2712', 'loss_audio': '0.4121', 'step': 8929, 'global_step': 8929}
[2025-08-16 07:47:23] {'loss': '0.6252', 'loss_video': '0.2259', 'loss_audio': '0.3993', 'step': 8939, 'global_step': 8939}
[2025-08-16 07:49:42] {'loss': '0.6504', 'loss_video': '0.2692', 'loss_audio': '0.3812', 'step': 8949, 'global_step': 8949}
[2025-08-16 07:52:10] {'loss': '0.6600', 'loss_video': '0.2525', 'loss_audio': '0.4075', 'step': 8959, 'global_step': 8959}
[2025-08-16 07:54:37] {'loss': '0.6680', 'loss_video': '0.2767', 'loss_audio': '0.3913', 'step': 8969, 'global_step': 8969}
[2025-08-16 07:57:21] {'loss': '0.6719', 'loss_video': '0.2529', 'loss_audio': '0.4191', 'step': 8979, 'global_step': 8979}
[2025-08-16 07:59:57] {'loss': '0.7082', 'loss_video': '0.2863', 'loss_audio': '0.4219', 'step': 8989, 'global_step': 8989}
[2025-08-16 08:02:15] {'loss': '0.6782', 'loss_video': '0.2697', 'loss_audio': '0.4085', 'step': 8999, 'global_step': 8999}
[2025-08-16 08:02:22] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 08:02:39] Saved checkpoint at epoch 0, step 9000, global_step 9000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9000
[2025-08-16 08:02:39] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8500 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 08:05:19] {'loss': '0.6872', 'loss_video': '0.2865', 'loss_audio': '0.4007', 'step': 9009, 'global_step': 9009}
[2025-08-16 08:08:01] {'loss': '0.6552', 'loss_video': '0.2712', 'loss_audio': '0.3840', 'step': 9019, 'global_step': 9019}
[2025-08-16 08:10:08] {'loss': '0.7198', 'loss_video': '0.2826', 'loss_audio': '0.4373', 'step': 9029, 'global_step': 9029}
[2025-08-16 08:12:56] {'loss': '0.6777', 'loss_video': '0.2461', 'loss_audio': '0.4316', 'step': 9039, 'global_step': 9039}
[2025-08-16 08:15:25] {'loss': '0.6782', 'loss_video': '0.2496', 'loss_audio': '0.4287', 'step': 9049, 'global_step': 9049}
[2025-08-16 08:17:49] {'loss': '0.6865', 'loss_video': '0.2681', 'loss_audio': '0.4184', 'step': 9059, 'global_step': 9059}
[2025-08-16 08:20:23] {'loss': '0.6552', 'loss_video': '0.2499', 'loss_audio': '0.4052', 'step': 9069, 'global_step': 9069}
[2025-08-16 08:22:52] {'loss': '0.7193', 'loss_video': '0.2892', 'loss_audio': '0.4302', 'step': 9079, 'global_step': 9079}
[2025-08-16 08:25:37] {'loss': '0.6566', 'loss_video': '0.2574', 'loss_audio': '0.3992', 'step': 9089, 'global_step': 9089}
[2025-08-16 08:27:49] {'loss': '0.6651', 'loss_video': '0.2482', 'loss_audio': '0.4168', 'step': 9099, 'global_step': 9099}
[2025-08-16 08:30:04] {'loss': '0.6849', 'loss_video': '0.2695', 'loss_audio': '0.4155', 'step': 9109, 'global_step': 9109}
[2025-08-16 08:32:25] {'loss': '0.6528', 'loss_video': '0.2467', 'loss_audio': '0.4061', 'step': 9119, 'global_step': 9119}
[2025-08-16 08:34:50] {'loss': '0.6847', 'loss_video': '0.2744', 'loss_audio': '0.4103', 'step': 9129, 'global_step': 9129}
[2025-08-16 08:37:39] {'loss': '0.6739', 'loss_video': '0.2627', 'loss_audio': '0.4113', 'step': 9139, 'global_step': 9139}
[2025-08-16 08:40:15] {'loss': '0.6964', 'loss_video': '0.2899', 'loss_audio': '0.4064', 'step': 9149, 'global_step': 9149}
[2025-08-16 08:42:46] {'loss': '0.7067', 'loss_video': '0.2913', 'loss_audio': '0.4154', 'step': 9159, 'global_step': 9159}
[2025-08-16 08:45:01] {'loss': '0.6891', 'loss_video': '0.2613', 'loss_audio': '0.4278', 'step': 9169, 'global_step': 9169}
[2025-08-16 08:47:33] {'loss': '0.7130', 'loss_video': '0.2673', 'loss_audio': '0.4458', 'step': 9179, 'global_step': 9179}
[2025-08-16 08:49:42] {'loss': '0.6251', 'loss_video': '0.2369', 'loss_audio': '0.3883', 'step': 9189, 'global_step': 9189}
[2025-08-16 08:52:11] {'loss': '0.7176', 'loss_video': '0.2648', 'loss_audio': '0.4529', 'step': 9199, 'global_step': 9199}
[2025-08-16 08:54:39] {'loss': '0.6111', 'loss_video': '0.2230', 'loss_audio': '0.3880', 'step': 9209, 'global_step': 9209}
[2025-08-16 08:57:08] {'loss': '0.7211', 'loss_video': '0.2991', 'loss_audio': '0.4219', 'step': 9219, 'global_step': 9219}
[2025-08-16 08:59:58] {'loss': '0.6037', 'loss_video': '0.2339', 'loss_audio': '0.3699', 'step': 9229, 'global_step': 9229}
[2025-08-16 09:02:28] {'loss': '0.7094', 'loss_video': '0.2631', 'loss_audio': '0.4463', 'step': 9239, 'global_step': 9239}
[2025-08-16 09:04:53] {'loss': '0.6893', 'loss_video': '0.2664', 'loss_audio': '0.4229', 'step': 9249, 'global_step': 9249}
[2025-08-16 09:04:59] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 09:05:16] Saved checkpoint at epoch 0, step 9250, global_step 9250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9250
[2025-08-16 09:05:17] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step8750 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 09:07:36] {'loss': '0.7023', 'loss_video': '0.2479', 'loss_audio': '0.4544', 'step': 9259, 'global_step': 9259}
[2025-08-16 09:09:59] {'loss': '0.6629', 'loss_video': '0.2454', 'loss_audio': '0.4174', 'step': 9269, 'global_step': 9269}
[2025-08-16 09:12:35] {'loss': '0.6601', 'loss_video': '0.2490', 'loss_audio': '0.4111', 'step': 9279, 'global_step': 9279}
[2025-08-16 09:15:00] {'loss': '0.7050', 'loss_video': '0.2730', 'loss_audio': '0.4320', 'step': 9289, 'global_step': 9289}
[2025-08-16 09:17:22] {'loss': '0.6988', 'loss_video': '0.2806', 'loss_audio': '0.4182', 'step': 9299, 'global_step': 9299}
[2025-08-16 09:19:43] {'loss': '0.6488', 'loss_video': '0.2560', 'loss_audio': '0.3928', 'step': 9309, 'global_step': 9309}
[2025-08-16 09:22:30] {'loss': '0.6962', 'loss_video': '0.2786', 'loss_audio': '0.4175', 'step': 9319, 'global_step': 9319}
[2025-08-16 09:25:10] {'loss': '0.7278', 'loss_video': '0.2686', 'loss_audio': '0.4592', 'step': 9329, 'global_step': 9329}
[2025-08-16 09:27:35] {'loss': '0.7260', 'loss_video': '0.2590', 'loss_audio': '0.4669', 'step': 9339, 'global_step': 9339}
[2025-08-16 09:30:11] {'loss': '0.7068', 'loss_video': '0.2844', 'loss_audio': '0.4224', 'step': 9349, 'global_step': 9349}
[2025-08-16 09:32:20] {'loss': '0.6830', 'loss_video': '0.2284', 'loss_audio': '0.4546', 'step': 9359, 'global_step': 9359}
[2025-08-16 09:35:06] {'loss': '0.6921', 'loss_video': '0.3012', 'loss_audio': '0.3909', 'step': 9369, 'global_step': 9369}
[2025-08-16 09:37:43] {'loss': '0.6529', 'loss_video': '0.2527', 'loss_audio': '0.4002', 'step': 9379, 'global_step': 9379}
[2025-08-16 09:40:20] {'loss': '0.6857', 'loss_video': '0.2677', 'loss_audio': '0.4181', 'step': 9389, 'global_step': 9389}
[2025-08-16 09:42:42] {'loss': '0.6627', 'loss_video': '0.2621', 'loss_audio': '0.4006', 'step': 9399, 'global_step': 9399}
[2025-08-16 09:45:11] {'loss': '0.6754', 'loss_video': '0.2820', 'loss_audio': '0.3934', 'step': 9409, 'global_step': 9409}
[2025-08-16 09:47:11] {'loss': '0.6980', 'loss_video': '0.2782', 'loss_audio': '0.4198', 'step': 9419, 'global_step': 9419}
[2025-08-16 09:49:28] {'loss': '0.6339', 'loss_video': '0.2516', 'loss_audio': '0.3823', 'step': 9429, 'global_step': 9429}
[2025-08-16 09:52:04] {'loss': '0.6509', 'loss_video': '0.2317', 'loss_audio': '0.4192', 'step': 9439, 'global_step': 9439}
[2025-08-16 09:54:46] {'loss': '0.6737', 'loss_video': '0.2599', 'loss_audio': '0.4138', 'step': 9449, 'global_step': 9449}
[2025-08-16 09:57:29] {'loss': '0.7414', 'loss_video': '0.2735', 'loss_audio': '0.4679', 'step': 9459, 'global_step': 9459}
[2025-08-16 09:59:48] {'loss': '0.6783', 'loss_video': '0.2479', 'loss_audio': '0.4304', 'step': 9469, 'global_step': 9469}
[2025-08-16 10:02:08] {'loss': '0.7138', 'loss_video': '0.2780', 'loss_audio': '0.4358', 'step': 9479, 'global_step': 9479}
[2025-08-16 10:04:46] {'loss': '0.6389', 'loss_video': '0.2522', 'loss_audio': '0.3867', 'step': 9489, 'global_step': 9489}
[2025-08-16 10:07:02] {'loss': '0.6780', 'loss_video': '0.2700', 'loss_audio': '0.4081', 'step': 9499, 'global_step': 9499}
[2025-08-16 10:07:08] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 10:07:26] Saved checkpoint at epoch 0, step 9500, global_step 9500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9500
[2025-08-16 10:07:27] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9000 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 10:09:56] {'loss': '0.7788', 'loss_video': '0.2937', 'loss_audio': '0.4850', 'step': 9509, 'global_step': 9509}
[2025-08-16 10:12:12] {'loss': '0.6842', 'loss_video': '0.2570', 'loss_audio': '0.4272', 'step': 9519, 'global_step': 9519}
[2025-08-16 10:14:30] {'loss': '0.7255', 'loss_video': '0.2689', 'loss_audio': '0.4567', 'step': 9529, 'global_step': 9529}
[2025-08-16 10:17:03] {'loss': '0.6845', 'loss_video': '0.2592', 'loss_audio': '0.4252', 'step': 9539, 'global_step': 9539}
[2025-08-16 10:19:31] {'loss': '0.6790', 'loss_video': '0.2757', 'loss_audio': '0.4032', 'step': 9549, 'global_step': 9549}
[2025-08-16 10:22:13] {'loss': '0.7145', 'loss_video': '0.2743', 'loss_audio': '0.4403', 'step': 9559, 'global_step': 9559}
[2025-08-16 10:24:54] {'loss': '0.7201', 'loss_video': '0.2810', 'loss_audio': '0.4391', 'step': 9569, 'global_step': 9569}
[2025-08-16 10:27:15] {'loss': '0.6900', 'loss_video': '0.2706', 'loss_audio': '0.4193', 'step': 9579, 'global_step': 9579}
[2025-08-16 10:29:48] {'loss': '0.6564', 'loss_video': '0.2718', 'loss_audio': '0.3846', 'step': 9589, 'global_step': 9589}
[2025-08-16 10:32:18] {'loss': '0.6238', 'loss_video': '0.2332', 'loss_audio': '0.3906', 'step': 9599, 'global_step': 9599}
[2025-08-16 10:34:31] {'loss': '0.6894', 'loss_video': '0.2570', 'loss_audio': '0.4324', 'step': 9609, 'global_step': 9609}
[2025-08-16 10:36:56] {'loss': '0.6937', 'loss_video': '0.2703', 'loss_audio': '0.4234', 'step': 9619, 'global_step': 9619}
[2025-08-16 10:39:17] {'loss': '0.6710', 'loss_video': '0.2557', 'loss_audio': '0.4153', 'step': 9629, 'global_step': 9629}
[2025-08-16 10:41:54] {'loss': '0.6752', 'loss_video': '0.2806', 'loss_audio': '0.3946', 'step': 9639, 'global_step': 9639}
[2025-08-16 10:44:15] {'loss': '0.7100', 'loss_video': '0.2892', 'loss_audio': '0.4208', 'step': 9649, 'global_step': 9649}
[2025-08-16 10:46:50] {'loss': '0.5977', 'loss_video': '0.2430', 'loss_audio': '0.3547', 'step': 9659, 'global_step': 9659}
[2025-08-16 10:49:09] {'loss': '0.6365', 'loss_video': '0.2191', 'loss_audio': '0.4174', 'step': 9669, 'global_step': 9669}
[2025-08-16 10:51:16] {'loss': '0.6879', 'loss_video': '0.2627', 'loss_audio': '0.4253', 'step': 9679, 'global_step': 9679}
[2025-08-16 10:53:44] {'loss': '0.6761', 'loss_video': '0.2657', 'loss_audio': '0.4104', 'step': 9689, 'global_step': 9689}
[2025-08-16 10:56:08] {'loss': '0.6508', 'loss_video': '0.2323', 'loss_audio': '0.4185', 'step': 9699, 'global_step': 9699}
[2025-08-16 10:58:20] {'loss': '0.6670', 'loss_video': '0.2716', 'loss_audio': '0.3954', 'step': 9709, 'global_step': 9709}
[2025-08-16 11:00:49] {'loss': '0.6227', 'loss_video': '0.2263', 'loss_audio': '0.3964', 'step': 9719, 'global_step': 9719}
[2025-08-16 11:03:01] {'loss': '0.6564', 'loss_video': '0.2287', 'loss_audio': '0.4277', 'step': 9729, 'global_step': 9729}
[2025-08-16 11:05:42] {'loss': '0.7196', 'loss_video': '0.2725', 'loss_audio': '0.4471', 'step': 9739, 'global_step': 9739}
[2025-08-16 11:08:19] {'loss': '0.7138', 'loss_video': '0.2732', 'loss_audio': '0.4406', 'step': 9749, 'global_step': 9749}
[2025-08-16 11:08:25] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 11:08:42] Saved checkpoint at epoch 0, step 9750, global_step 9750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9750
[2025-08-16 11:08:43] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9250 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 11:10:53] {'loss': '0.6747', 'loss_video': '0.2402', 'loss_audio': '0.4345', 'step': 9759, 'global_step': 9759}
[2025-08-16 11:13:19] {'loss': '0.6336', 'loss_video': '0.2428', 'loss_audio': '0.3908', 'step': 9769, 'global_step': 9769}
[2025-08-16 11:15:41] {'loss': '0.7173', 'loss_video': '0.2742', 'loss_audio': '0.4431', 'step': 9779, 'global_step': 9779}
[2025-08-16 11:17:54] {'loss': '0.6593', 'loss_video': '0.2314', 'loss_audio': '0.4279', 'step': 9789, 'global_step': 9789}
[2025-08-16 11:20:33] {'loss': '0.6969', 'loss_video': '0.2878', 'loss_audio': '0.4091', 'step': 9799, 'global_step': 9799}
[2025-08-16 11:23:18] {'loss': '0.6971', 'loss_video': '0.2768', 'loss_audio': '0.4204', 'step': 9809, 'global_step': 9809}
[2025-08-16 11:25:52] {'loss': '0.7795', 'loss_video': '0.3226', 'loss_audio': '0.4570', 'step': 9819, 'global_step': 9819}
[2025-08-16 11:28:23] {'loss': '0.6346', 'loss_video': '0.2247', 'loss_audio': '0.4099', 'step': 9829, 'global_step': 9829}
[2025-08-16 11:30:55] {'loss': '0.7626', 'loss_video': '0.3184', 'loss_audio': '0.4441', 'step': 9839, 'global_step': 9839}
[2025-08-16 11:33:43] {'loss': '0.7586', 'loss_video': '0.3124', 'loss_audio': '0.4462', 'step': 9849, 'global_step': 9849}
[2025-08-16 11:35:47] {'loss': '0.6909', 'loss_video': '0.2661', 'loss_audio': '0.4248', 'step': 9859, 'global_step': 9859}
[2025-08-16 11:38:19] {'loss': '0.6841', 'loss_video': '0.2653', 'loss_audio': '0.4189', 'step': 9869, 'global_step': 9869}
[2025-08-16 11:40:45] {'loss': '0.6588', 'loss_video': '0.2514', 'loss_audio': '0.4074', 'step': 9879, 'global_step': 9879}
[2025-08-16 11:43:18] {'loss': '0.7075', 'loss_video': '0.2707', 'loss_audio': '0.4368', 'step': 9889, 'global_step': 9889}
[2025-08-16 11:45:10] {'loss': '0.6972', 'loss_video': '0.2366', 'loss_audio': '0.4605', 'step': 9899, 'global_step': 9899}
[2025-08-16 11:47:51] {'loss': '0.7419', 'loss_video': '0.2914', 'loss_audio': '0.4505', 'step': 9909, 'global_step': 9909}
[2025-08-16 11:49:56] {'loss': '0.6512', 'loss_video': '0.2310', 'loss_audio': '0.4202', 'step': 9919, 'global_step': 9919}
[2025-08-16 11:52:26] {'loss': '0.6678', 'loss_video': '0.2454', 'loss_audio': '0.4224', 'step': 9929, 'global_step': 9929}
[2025-08-16 11:54:44] {'loss': '0.6905', 'loss_video': '0.2659', 'loss_audio': '0.4246', 'step': 9939, 'global_step': 9939}
[2025-08-16 11:57:09] {'loss': '0.6926', 'loss_video': '0.2718', 'loss_audio': '0.4208', 'step': 9949, 'global_step': 9949}
[2025-08-16 11:59:51] {'loss': '0.6487', 'loss_video': '0.2326', 'loss_audio': '0.4162', 'step': 9959, 'global_step': 9959}
[2025-08-16 12:01:55] {'loss': '0.6759', 'loss_video': '0.2693', 'loss_audio': '0.4066', 'step': 9969, 'global_step': 9969}
[2025-08-16 12:04:31] {'loss': '0.6748', 'loss_video': '0.2519', 'loss_audio': '0.4230', 'step': 9979, 'global_step': 9979}
[2025-08-16 12:06:47] {'loss': '0.6670', 'loss_video': '0.2654', 'loss_audio': '0.4016', 'step': 9989, 'global_step': 9989}
[2025-08-16 12:09:25] {'loss': '0.6709', 'loss_video': '0.2656', 'loss_audio': '0.4053', 'step': 9999, 'global_step': 9999}
[2025-08-16 12:09:31] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 12:09:49] Saved checkpoint at epoch 0, step 10000, global_step 10000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step10000
[2025-08-16 12:09:49] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9500 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 12:12:20] {'loss': '0.6237', 'loss_video': '0.2365', 'loss_audio': '0.3872', 'step': 10009, 'global_step': 10009}
[2025-08-16 12:15:01] {'loss': '0.6990', 'loss_video': '0.2505', 'loss_audio': '0.4485', 'step': 10019, 'global_step': 10019}
[2025-08-16 12:17:40] {'loss': '0.6740', 'loss_video': '0.2809', 'loss_audio': '0.3932', 'step': 10029, 'global_step': 10029}
[2025-08-16 12:20:16] {'loss': '0.7634', 'loss_video': '0.2837', 'loss_audio': '0.4797', 'step': 10039, 'global_step': 10039}
[2025-08-16 12:22:37] {'loss': '0.7021', 'loss_video': '0.2717', 'loss_audio': '0.4304', 'step': 10049, 'global_step': 10049}
[2025-08-16 12:24:41] {'loss': '0.6804', 'loss_video': '0.2771', 'loss_audio': '0.4033', 'step': 10059, 'global_step': 10059}
[2025-08-16 12:27:17] {'loss': '0.7122', 'loss_video': '0.2925', 'loss_audio': '0.4197', 'step': 10069, 'global_step': 10069}
[2025-08-16 12:29:54] {'loss': '0.7279', 'loss_video': '0.2945', 'loss_audio': '0.4334', 'step': 10079, 'global_step': 10079}
[2025-08-16 12:32:19] {'loss': '0.7094', 'loss_video': '0.2764', 'loss_audio': '0.4329', 'step': 10089, 'global_step': 10089}
[2025-08-16 12:34:54] {'loss': '0.6840', 'loss_video': '0.2730', 'loss_audio': '0.4109', 'step': 10099, 'global_step': 10099}
[2025-08-16 12:37:40] {'loss': '0.6712', 'loss_video': '0.2726', 'loss_audio': '0.3985', 'step': 10109, 'global_step': 10109}
[2025-08-16 12:40:19] {'loss': '0.6829', 'loss_video': '0.2664', 'loss_audio': '0.4165', 'step': 10119, 'global_step': 10119}
[2025-08-16 12:42:46] {'loss': '0.7156', 'loss_video': '0.2898', 'loss_audio': '0.4258', 'step': 10129, 'global_step': 10129}
[2025-08-16 12:45:13] {'loss': '0.6318', 'loss_video': '0.2133', 'loss_audio': '0.4185', 'step': 10139, 'global_step': 10139}
[2025-08-16 12:47:43] {'loss': '0.7048', 'loss_video': '0.2856', 'loss_audio': '0.4191', 'step': 10149, 'global_step': 10149}
[2025-08-16 12:50:23] {'loss': '0.6277', 'loss_video': '0.2431', 'loss_audio': '0.3846', 'step': 10159, 'global_step': 10159}
[2025-08-16 12:52:37] {'loss': '0.6537', 'loss_video': '0.2613', 'loss_audio': '0.3924', 'step': 10169, 'global_step': 10169}
[2025-08-16 12:55:15] {'loss': '0.6955', 'loss_video': '0.2643', 'loss_audio': '0.4312', 'step': 10179, 'global_step': 10179}
[2025-08-16 12:57:50] {'loss': '0.7206', 'loss_video': '0.2715', 'loss_audio': '0.4491', 'step': 10189, 'global_step': 10189}
[2025-08-16 13:00:17] {'loss': '0.6974', 'loss_video': '0.2888', 'loss_audio': '0.4086', 'step': 10199, 'global_step': 10199}
[2025-08-16 13:02:53] {'loss': '0.7151', 'loss_video': '0.2922', 'loss_audio': '0.4229', 'step': 10209, 'global_step': 10209}
[2025-08-16 13:05:21] {'loss': '0.6876', 'loss_video': '0.2874', 'loss_audio': '0.4002', 'step': 10219, 'global_step': 10219}
[2025-08-16 13:08:02] {'loss': '0.6866', 'loss_video': '0.2660', 'loss_audio': '0.4206', 'step': 10229, 'global_step': 10229}
[2025-08-16 13:10:54] {'loss': '0.6652', 'loss_video': '0.2425', 'loss_audio': '0.4226', 'step': 10239, 'global_step': 10239}
[2025-08-16 13:13:07] {'loss': '0.6925', 'loss_video': '0.2570', 'loss_audio': '0.4355', 'step': 10249, 'global_step': 10249}
[2025-08-16 13:13:13] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 13:13:31] Saved checkpoint at epoch 0, step 10250, global_step 10250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step10250
[2025-08-16 13:13:32] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step9750 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 13:15:51] {'loss': '0.6419', 'loss_video': '0.2379', 'loss_audio': '0.4040', 'step': 10259, 'global_step': 10259}
[2025-08-16 13:17:55] {'loss': '0.6899', 'loss_video': '0.2513', 'loss_audio': '0.4386', 'step': 10269, 'global_step': 10269}
[2025-08-16 13:20:10] {'loss': '0.6971', 'loss_video': '0.2794', 'loss_audio': '0.4178', 'step': 10279, 'global_step': 10279}
[2025-08-16 13:22:51] {'loss': '0.6727', 'loss_video': '0.2645', 'loss_audio': '0.4082', 'step': 10289, 'global_step': 10289}
[2025-08-16 13:25:19] {'loss': '0.6263', 'loss_video': '0.2312', 'loss_audio': '0.3951', 'step': 10299, 'global_step': 10299}
[2025-08-16 13:27:42] {'loss': '0.6885', 'loss_video': '0.2756', 'loss_audio': '0.4130', 'step': 10309, 'global_step': 10309}
[2025-08-16 13:30:01] {'loss': '0.6958', 'loss_video': '0.2785', 'loss_audio': '0.4174', 'step': 10319, 'global_step': 10319}
[2025-08-16 13:32:27] {'loss': '0.6846', 'loss_video': '0.2604', 'loss_audio': '0.4242', 'step': 10329, 'global_step': 10329}
[2025-08-16 13:34:55] {'loss': '0.7615', 'loss_video': '0.2799', 'loss_audio': '0.4817', 'step': 10339, 'global_step': 10339}
[2025-08-16 13:37:12] {'loss': '0.6650', 'loss_video': '0.2727', 'loss_audio': '0.3922', 'step': 10349, 'global_step': 10349}
[2025-08-16 13:39:30] {'loss': '0.6504', 'loss_video': '0.2507', 'loss_audio': '0.3997', 'step': 10359, 'global_step': 10359}
[2025-08-16 13:42:05] {'loss': '0.6752', 'loss_video': '0.2688', 'loss_audio': '0.4064', 'step': 10369, 'global_step': 10369}
[2025-08-16 13:44:37] {'loss': '0.6497', 'loss_video': '0.2488', 'loss_audio': '0.4009', 'step': 10379, 'global_step': 10379}
[2025-08-16 13:47:11] {'loss': '0.6885', 'loss_video': '0.2697', 'loss_audio': '0.4188', 'step': 10389, 'global_step': 10389}
[2025-08-16 13:49:31] {'loss': '0.6572', 'loss_video': '0.2446', 'loss_audio': '0.4126', 'step': 10399, 'global_step': 10399}
[2025-08-16 13:51:43] {'loss': '0.6736', 'loss_video': '0.2544', 'loss_audio': '0.4192', 'step': 10409, 'global_step': 10409}
[2025-08-16 13:54:15] {'loss': '0.6891', 'loss_video': '0.2561', 'loss_audio': '0.4331', 'step': 10419, 'global_step': 10419}
[2025-08-16 13:56:46] {'loss': '0.7431', 'loss_video': '0.2708', 'loss_audio': '0.4722', 'step': 10429, 'global_step': 10429}
[2025-08-16 13:59:05] {'loss': '0.6738', 'loss_video': '0.2737', 'loss_audio': '0.4000', 'step': 10439, 'global_step': 10439}
[2025-08-16 14:01:24] {'loss': '0.6479', 'loss_video': '0.2468', 'loss_audio': '0.4011', 'step': 10449, 'global_step': 10449}
[2025-08-16 14:03:45] {'loss': '0.6828', 'loss_video': '0.2625', 'loss_audio': '0.4203', 'step': 10459, 'global_step': 10459}
[2025-08-16 14:05:48] {'loss': '0.6296', 'loss_video': '0.2494', 'loss_audio': '0.3802', 'step': 10469, 'global_step': 10469}
[2025-08-16 14:08:06] {'loss': '0.6688', 'loss_video': '0.2467', 'loss_audio': '0.4220', 'step': 10479, 'global_step': 10479}
[2025-08-16 14:10:44] {'loss': '0.7065', 'loss_video': '0.2590', 'loss_audio': '0.4475', 'step': 10489, 'global_step': 10489}
[2025-08-16 14:13:01] {'loss': '0.6646', 'loss_video': '0.2615', 'loss_audio': '0.4031', 'step': 10499, 'global_step': 10499}
[2025-08-16 14:13:08] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-16 14:13:26] Saved checkpoint at epoch 0, step 10500, global_step 10500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step10500
[2025-08-16 14:13:26] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step10000 has been deleted successfully as cfg.save_total_limit!
[2025-08-16 14:16:00] {'loss': '0.7167', 'loss_video': '0.2772', 'loss_audio': '0.4395', 'step': 10509, 'global_step': 10509} [2025-08-16 14:18:32] {'loss': '0.6358', 'loss_video': '0.2379', 'loss_audio': '0.3979', 'step': 10519, 'global_step': 10519} [2025-08-16 14:20:57] {'loss': '0.6401', 'loss_video': '0.2454', 'loss_audio': '0.3947', 'step': 10529, 'global_step': 10529} [2025-08-16 14:23:11] {'loss': '0.6482', 'loss_video': '0.2297', 'loss_audio': '0.4184', 'step': 10539, 'global_step': 10539} [2025-08-16 14:25:29] {'loss': '0.6629', 'loss_video': '0.2592', 'loss_audio': '0.4037', 'step': 10549, 'global_step': 10549} [2025-08-16 14:27:15] Building buckets... [2025-08-16 14:27:19] Bucket Info: [2025-08-16 14:27:19] Bucket [#sample, #batch] by aspect ratio: {'0.38': [73, 8], '0.43': [269, 41], '0.48': [48, 3], '0.50': [82, 6], '0.53': [165, 22], '0.54': [578, 72], '0.56': [94859, 16925], '0.62': [844, 129], '0.67': [2354, 312], '0.75': [34023, 3485], '1.00': [303, 27], '1.33': [268, 21], '1.50': [76, 7], '1.78': [870, 89]} [2025-08-16 14:27:19] Image Bucket [#sample, #batch] by HxWxT: {} [2025-08-16 14:27:19] Video Bucket [#sample, #batch] by HxWxT: {('480p', 81): [8991, 2991], ('480p', 65): [16275, 4064], ('480p', 49): [13032, 3252], ('480p', 33): [7904, 1576], ('360p', 81): [7022, 1399], ('360p', 65): [5429, 899], ('360p', 49): [6709, 1113], ('360p', 33): [7842, 976], ('240p', 81): [12136, 1206], ('240p', 65): [14663, 1217], ('240p', 49): [13955, 1158], ('240p', 33): [20854, 1296]} [2025-08-16 14:27:19] #training batch: 20.65 K, #training sample: 131.65 K, #non empty bucket: 164 [2025-08-16 14:27:19] Beginning epoch 1... 
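The "Bucket Info" report above groups samples into (resolution, num_frames) buckets, each with its own batch size from the `bucket_config` in the header (e.g. `'480p': {81: ((1.0, 0.2), 10), ...}` means batch size 10 for 81-frame 480p clips). A simplified sketch of how such a report could be computed; the helper name and the trimmed `BATCH_SIZE` table are illustrative, and the real builder also buckets by aspect ratio and applies per-bucket keep probabilities, which this sketch omits (so its batch counts need not match the log exactly):

```python
from collections import defaultdict

# Illustrative subset of per-bucket batch sizes, in the spirit of the
# header's bucket_config; not the full table.
BATCH_SIZE = {
    ("480p", 81): 3, ("480p", 65): 4,
    ("360p", 81): 5, ("240p", 33): 10,
}


def build_bucket_report(samples, batch_size=BATCH_SIZE):
    """Group (resolution, num_frames) samples into buckets.

    Returns {bucket: (num_samples, num_full_batches)}, mirroring the
    '[#sample, #batch] by HxWxT' lines in the log. Samples whose bucket
    is not configured are dropped, which is one reason the log's
    per-epoch sample count (131.65 K) is below the dataset size (140048).
    """
    counts = defaultdict(int)
    for res, frames in samples:
        if (res, frames) in batch_size:
            counts[(res, frames)] += 1
    return {b: (n, n // batch_size[b]) for b, n in counts.items()}
```

Because buckets are rebuilt at each epoch boundary ("Building buckets..." right before "Beginning epoch 1..."), the batch count can vary from epoch to epoch when keep probabilities are sampled per epoch.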
[2025-08-16 14:28:30] {'loss': '0.6372', 'loss_video': '0.2420', 'loss_audio': '0.3952', 'step': 2, 'global_step': 10559} [2025-08-16 14:30:56] {'loss': '0.6268', 'loss_video': '0.2429', 'loss_audio': '0.3839', 'step': 12, 'global_step': 10569} [2025-08-16 14:33:07] {'loss': '0.6300', 'loss_video': '0.2354', 'loss_audio': '0.3946', 'step': 22, 'global_step': 10579} [2025-08-16 14:35:41] {'loss': '0.6411', 'loss_video': '0.2396', 'loss_audio': '0.4015', 'step': 32, 'global_step': 10589} [2025-08-16 14:38:00] {'loss': '0.6616', 'loss_video': '0.2724', 'loss_audio': '0.3892', 'step': 42, 'global_step': 10599} [2025-08-16 14:40:18] {'loss': '0.6205', 'loss_video': '0.2438', 'loss_audio': '0.3767', 'step': 52, 'global_step': 10609} [2025-08-16 14:42:36] {'loss': '0.6901', 'loss_video': '0.2689', 'loss_audio': '0.4212', 'step': 62, 'global_step': 10619} [2025-08-16 14:45:14] {'loss': '0.6484', 'loss_video': '0.2290', 'loss_audio': '0.4193', 'step': 72, 'global_step': 10629} [2025-08-16 14:47:42] {'loss': '0.7001', 'loss_video': '0.2794', 'loss_audio': '0.4207', 'step': 82, 'global_step': 10639} [2025-08-16 14:50:21] {'loss': '0.7328', 'loss_video': '0.2919', 'loss_audio': '0.4409', 'step': 92, 'global_step': 10649} [2025-08-16 14:53:07] {'loss': '0.7157', 'loss_video': '0.3156', 'loss_audio': '0.4001', 'step': 102, 'global_step': 10659} [2025-08-16 14:55:11] {'loss': '0.6419', 'loss_video': '0.2391', 'loss_audio': '0.4028', 'step': 112, 'global_step': 10669} [2025-08-16 14:57:34] {'loss': '0.6844', 'loss_video': '0.2482', 'loss_audio': '0.4362', 'step': 122, 'global_step': 10679} [2025-08-16 15:00:00] {'loss': '0.6285', 'loss_video': '0.2410', 'loss_audio': '0.3875', 'step': 132, 'global_step': 10689} [2025-08-16 15:02:19] {'loss': '0.6929', 'loss_video': '0.2676', 'loss_audio': '0.4253', 'step': 142, 'global_step': 10699} [2025-08-16 15:04:45] {'loss': '0.7129', 'loss_video': '0.2739', 'loss_audio': '0.4390', 'step': 152, 'global_step': 10709} [2025-08-16 15:06:53] 
{'loss': '0.6932', 'loss_video': '0.2961', 'loss_audio': '0.3972', 'step': 162, 'global_step': 10719} [2025-08-16 15:09:18] {'loss': '0.6463', 'loss_video': '0.2439', 'loss_audio': '0.4024', 'step': 172, 'global_step': 10729} [2025-08-16 15:11:59] {'loss': '0.6231', 'loss_video': '0.2498', 'loss_audio': '0.3734', 'step': 182, 'global_step': 10739} [2025-08-16 15:14:35] {'loss': '0.7417', 'loss_video': '0.2873', 'loss_audio': '0.4544', 'step': 192, 'global_step': 10749} [2025-08-16 15:14:42] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 15:14:59] Saved checkpoint at epoch 1, step 193, global_step 10750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step10750 [2025-08-16 15:14:59] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step10250 has been deleted successfully as cfg.save_total_limit! [2025-08-16 15:17:50] {'loss': '0.7190', 'loss_video': '0.2850', 'loss_audio': '0.4339', 'step': 202, 'global_step': 10759} [2025-08-16 15:20:22] {'loss': '0.5926', 'loss_video': '0.2306', 'loss_audio': '0.3620', 'step': 212, 'global_step': 10769} [2025-08-16 15:22:43] {'loss': '0.6828', 'loss_video': '0.2850', 'loss_audio': '0.3978', 'step': 222, 'global_step': 10779} [2025-08-16 15:25:24] {'loss': '0.6944', 'loss_video': '0.3031', 'loss_audio': '0.3912', 'step': 232, 'global_step': 10789} [2025-08-16 15:27:37] {'loss': '0.7419', 'loss_video': '0.2964', 'loss_audio': '0.4454', 'step': 242, 'global_step': 10799} [2025-08-16 15:30:09] {'loss': '0.6702', 'loss_video': '0.2727', 'loss_audio': '0.3975', 'step': 252, 'global_step': 10809} [2025-08-16 15:32:32] {'loss': '0.6785', 'loss_video': '0.2392', 'loss_audio': '0.4393', 'step': 262, 'global_step': 10819} [2025-08-16 15:35:00] {'loss': '0.6379', 'loss_video': '0.2530', 'loss_audio': '0.3848', 'step': 272, 'global_step': 10829} [2025-08-16 15:37:29] {'loss': '0.7085', 'loss_video': 
'0.2760', 'loss_audio': '0.4325', 'step': 282, 'global_step': 10839} [2025-08-16 15:39:52] {'loss': '0.7142', 'loss_video': '0.2865', 'loss_audio': '0.4277', 'step': 292, 'global_step': 10849} [2025-08-16 15:42:14] {'loss': '0.7380', 'loss_video': '0.2918', 'loss_audio': '0.4462', 'step': 302, 'global_step': 10859} [2025-08-16 15:44:32] {'loss': '0.6900', 'loss_video': '0.2715', 'loss_audio': '0.4185', 'step': 312, 'global_step': 10869} [2025-08-16 15:46:42] {'loss': '0.6787', 'loss_video': '0.2653', 'loss_audio': '0.4134', 'step': 322, 'global_step': 10879} [2025-08-16 15:49:14] {'loss': '0.7317', 'loss_video': '0.2824', 'loss_audio': '0.4492', 'step': 332, 'global_step': 10889} [2025-08-16 15:51:40] {'loss': '0.6697', 'loss_video': '0.2701', 'loss_audio': '0.3996', 'step': 342, 'global_step': 10899} [2025-08-16 15:54:01] {'loss': '0.6310', 'loss_video': '0.2180', 'loss_audio': '0.4129', 'step': 352, 'global_step': 10909} [2025-08-16 15:56:26] {'loss': '0.6494', 'loss_video': '0.2529', 'loss_audio': '0.3965', 'step': 362, 'global_step': 10919} [2025-08-16 15:58:51] {'loss': '0.7184', 'loss_video': '0.2866', 'loss_audio': '0.4318', 'step': 372, 'global_step': 10929} [2025-08-16 16:01:19] {'loss': '0.6095', 'loss_video': '0.2268', 'loss_audio': '0.3827', 'step': 382, 'global_step': 10939} [2025-08-16 16:03:52] {'loss': '0.5790', 'loss_video': '0.1963', 'loss_audio': '0.3827', 'step': 392, 'global_step': 10949} [2025-08-16 16:06:16] {'loss': '0.7096', 'loss_video': '0.2960', 'loss_audio': '0.4136', 'step': 402, 'global_step': 10959} [2025-08-16 16:08:40] {'loss': '0.6448', 'loss_video': '0.2511', 'loss_audio': '0.3937', 'step': 412, 'global_step': 10969} [2025-08-16 16:11:02] {'loss': '0.7023', 'loss_video': '0.2825', 'loss_audio': '0.4198', 'step': 422, 'global_step': 10979} [2025-08-16 16:13:55] {'loss': '0.7326', 'loss_video': '0.2898', 'loss_audio': '0.4427', 'step': 432, 'global_step': 10989} [2025-08-16 16:16:25] {'loss': '0.6513', 'loss_video': '0.2661', 
'loss_audio': '0.3852', 'step': 442, 'global_step': 10999} [2025-08-16 16:16:31] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 16:16:50] Saved checkpoint at epoch 1, step 443, global_step 11000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11000 [2025-08-16 16:16:50] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch000-global_step10500 has been deleted successfully as cfg.save_total_limit! [2025-08-16 16:19:30] {'loss': '0.7135', 'loss_video': '0.2903', 'loss_audio': '0.4232', 'step': 452, 'global_step': 11009} [2025-08-16 16:22:00] {'loss': '0.7174', 'loss_video': '0.2992', 'loss_audio': '0.4183', 'step': 462, 'global_step': 11019} [2025-08-16 16:24:19] {'loss': '0.6875', 'loss_video': '0.2343', 'loss_audio': '0.4532', 'step': 472, 'global_step': 11029} [2025-08-16 16:26:40] {'loss': '0.7962', 'loss_video': '0.2952', 'loss_audio': '0.5010', 'step': 482, 'global_step': 11039} [2025-08-16 16:28:57] {'loss': '0.7443', 'loss_video': '0.2741', 'loss_audio': '0.4702', 'step': 492, 'global_step': 11049} [2025-08-16 16:31:19] {'loss': '0.7319', 'loss_video': '0.2607', 'loss_audio': '0.4712', 'step': 502, 'global_step': 11059} [2025-08-16 16:33:59] {'loss': '0.6558', 'loss_video': '0.2624', 'loss_audio': '0.3934', 'step': 512, 'global_step': 11069} [2025-08-16 16:36:31] {'loss': '0.6232', 'loss_video': '0.2428', 'loss_audio': '0.3804', 'step': 522, 'global_step': 11079} [2025-08-16 16:39:09] {'loss': '0.6458', 'loss_video': '0.2495', 'loss_audio': '0.3963', 'step': 532, 'global_step': 11089} [2025-08-16 16:41:45] {'loss': '0.7816', 'loss_video': '0.3179', 'loss_audio': '0.4637', 'step': 542, 'global_step': 11099} [2025-08-16 16:43:57] {'loss': '0.6706', 'loss_video': '0.2523', 'loss_audio': '0.4183', 'step': 552, 'global_step': 11109} [2025-08-16 16:46:29] {'loss': '0.7035', 'loss_video': '0.2773', 'loss_audio': '0.4262', 'step': 562, 
'global_step': 11119} [2025-08-16 16:48:50] {'loss': '0.6317', 'loss_video': '0.2364', 'loss_audio': '0.3952', 'step': 572, 'global_step': 11129} [2025-08-16 16:51:27] {'loss': '0.6677', 'loss_video': '0.2936', 'loss_audio': '0.3741', 'step': 582, 'global_step': 11139} [2025-08-16 16:53:38] {'loss': '0.6176', 'loss_video': '0.2219', 'loss_audio': '0.3957', 'step': 592, 'global_step': 11149} [2025-08-16 16:56:11] {'loss': '0.6571', 'loss_video': '0.2386', 'loss_audio': '0.4185', 'step': 602, 'global_step': 11159} [2025-08-16 16:58:45] {'loss': '0.6840', 'loss_video': '0.2634', 'loss_audio': '0.4206', 'step': 612, 'global_step': 11169} [2025-08-16 17:01:18] {'loss': '0.6892', 'loss_video': '0.2789', 'loss_audio': '0.4102', 'step': 622, 'global_step': 11179} [2025-08-16 17:03:52] {'loss': '0.6593', 'loss_video': '0.2560', 'loss_audio': '0.4033', 'step': 632, 'global_step': 11189} [2025-08-16 17:06:33] {'loss': '0.6403', 'loss_video': '0.2742', 'loss_audio': '0.3660', 'step': 642, 'global_step': 11199} [2025-08-16 17:09:00] {'loss': '0.7702', 'loss_video': '0.2829', 'loss_audio': '0.4872', 'step': 652, 'global_step': 11209} [2025-08-16 17:11:36] {'loss': '0.7201', 'loss_video': '0.2588', 'loss_audio': '0.4614', 'step': 662, 'global_step': 11219} [2025-08-16 17:13:54] {'loss': '0.6500', 'loss_video': '0.2424', 'loss_audio': '0.4076', 'step': 672, 'global_step': 11229} [2025-08-16 17:16:23] {'loss': '0.6956', 'loss_video': '0.2671', 'loss_audio': '0.4285', 'step': 682, 'global_step': 11239} [2025-08-16 17:18:58] {'loss': '0.6431', 'loss_video': '0.2324', 'loss_audio': '0.4107', 'step': 692, 'global_step': 11249} [2025-08-16 17:19:05] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. 
[2025-08-16 17:19:22] Saved checkpoint at epoch 1, step 693, global_step 11250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11250 [2025-08-16 17:19:22] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step10750 has been deleted successfully as cfg.save_total_limit! [2025-08-16 17:21:57] {'loss': '0.7507', 'loss_video': '0.2754', 'loss_audio': '0.4753', 'step': 702, 'global_step': 11259} [2025-08-16 17:24:26] {'loss': '0.6823', 'loss_video': '0.2665', 'loss_audio': '0.4158', 'step': 712, 'global_step': 11269} [2025-08-16 17:26:57] {'loss': '0.6467', 'loss_video': '0.2614', 'loss_audio': '0.3853', 'step': 722, 'global_step': 11279} [2025-08-16 17:29:39] {'loss': '0.6710', 'loss_video': '0.2623', 'loss_audio': '0.4087', 'step': 732, 'global_step': 11289} [2025-08-16 17:31:55] {'loss': '0.7225', 'loss_video': '0.2838', 'loss_audio': '0.4387', 'step': 742, 'global_step': 11299} [2025-08-16 17:34:23] {'loss': '0.6782', 'loss_video': '0.2552', 'loss_audio': '0.4230', 'step': 752, 'global_step': 11309} [2025-08-16 17:36:54] {'loss': '0.6201', 'loss_video': '0.2214', 'loss_audio': '0.3987', 'step': 762, 'global_step': 11319} [2025-08-16 17:39:33] {'loss': '0.6205', 'loss_video': '0.2221', 'loss_audio': '0.3984', 'step': 772, 'global_step': 11329} [2025-08-16 17:41:39] {'loss': '0.6751', 'loss_video': '0.2763', 'loss_audio': '0.3988', 'step': 782, 'global_step': 11339} [2025-08-16 17:44:15] {'loss': '0.6664', 'loss_video': '0.2710', 'loss_audio': '0.3954', 'step': 792, 'global_step': 11349} [2025-08-16 17:46:42] {'loss': '0.6433', 'loss_video': '0.2590', 'loss_audio': '0.3842', 'step': 802, 'global_step': 11359} [2025-08-16 17:49:08] {'loss': '0.6884', 'loss_video': '0.2684', 'loss_audio': '0.4199', 'step': 812, 'global_step': 11369} [2025-08-16 17:51:47] {'loss': '0.6145', 'loss_video': '0.2230', 'loss_audio': '0.3914', 'step': 822, 'global_step': 11379} [2025-08-16 17:54:19] {'loss': '0.6854', 'loss_video': '0.2573', 'loss_audio': '0.4281', 
'step': 832, 'global_step': 11389} [2025-08-16 17:56:57] {'loss': '0.7791', 'loss_video': '0.2936', 'loss_audio': '0.4855', 'step': 842, 'global_step': 11399} [2025-08-16 17:59:37] {'loss': '0.7195', 'loss_video': '0.2801', 'loss_audio': '0.4393', 'step': 852, 'global_step': 11409} [2025-08-16 18:02:16] {'loss': '0.6979', 'loss_video': '0.2812', 'loss_audio': '0.4167', 'step': 862, 'global_step': 11419} [2025-08-16 18:04:47] {'loss': '0.7539', 'loss_video': '0.3052', 'loss_audio': '0.4488', 'step': 872, 'global_step': 11429} [2025-08-16 18:07:32] {'loss': '0.7017', 'loss_video': '0.2656', 'loss_audio': '0.4361', 'step': 882, 'global_step': 11439} [2025-08-16 18:09:51] {'loss': '0.7134', 'loss_video': '0.3103', 'loss_audio': '0.4031', 'step': 892, 'global_step': 11449} [2025-08-16 18:12:28] {'loss': '0.6769', 'loss_video': '0.2672', 'loss_audio': '0.4096', 'step': 902, 'global_step': 11459} [2025-08-16 18:15:19] {'loss': '0.6937', 'loss_video': '0.2785', 'loss_audio': '0.4152', 'step': 912, 'global_step': 11469} [2025-08-16 18:17:30] {'loss': '0.6515', 'loss_video': '0.2500', 'loss_audio': '0.4015', 'step': 922, 'global_step': 11479} [2025-08-16 18:20:13] {'loss': '0.6190', 'loss_video': '0.2262', 'loss_audio': '0.3928', 'step': 932, 'global_step': 11489} [2025-08-16 18:22:31] {'loss': '0.7227', 'loss_video': '0.2574', 'loss_audio': '0.4653', 'step': 942, 'global_step': 11499} [2025-08-16 18:22:38] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 18:22:56] Saved checkpoint at epoch 1, step 943, global_step 11500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11500 [2025-08-16 18:22:56] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 18:25:13] {'loss': '0.6297', 'loss_video': '0.2254', 'loss_audio': '0.4043', 'step': 952, 'global_step': 11509} [2025-08-16 18:27:50] {'loss': '0.6926', 'loss_video': '0.2512', 'loss_audio': '0.4414', 'step': 962, 'global_step': 11519} [2025-08-16 18:30:14] {'loss': '0.6910', 'loss_video': '0.2660', 'loss_audio': '0.4250', 'step': 972, 'global_step': 11529} [2025-08-16 18:32:25] {'loss': '0.6487', 'loss_video': '0.2449', 'loss_audio': '0.4038', 'step': 982, 'global_step': 11539} [2025-08-16 18:34:34] {'loss': '0.6410', 'loss_video': '0.2445', 'loss_audio': '0.3965', 'step': 992, 'global_step': 11549} [2025-08-16 18:37:16] {'loss': '0.6529', 'loss_video': '0.2330', 'loss_audio': '0.4198', 'step': 1002, 'global_step': 11559} [2025-08-16 18:39:43] {'loss': '0.6776', 'loss_video': '0.2768', 'loss_audio': '0.4008', 'step': 1012, 'global_step': 11569} [2025-08-16 18:41:52] {'loss': '0.6498', 'loss_video': '0.2543', 'loss_audio': '0.3956', 'step': 1022, 'global_step': 11579} [2025-08-16 18:44:47] {'loss': '0.6704', 'loss_video': '0.2664', 'loss_audio': '0.4040', 'step': 1032, 'global_step': 11589} [2025-08-16 18:47:06] {'loss': '0.6854', 'loss_video': '0.2479', 'loss_audio': '0.4375', 'step': 1042, 'global_step': 11599} [2025-08-16 18:49:32] {'loss': '0.7942', 'loss_video': '0.3225', 'loss_audio': '0.4718', 'step': 1052, 'global_step': 11609} [2025-08-16 18:51:58] {'loss': '0.6955', 'loss_video': '0.2526', 'loss_audio': '0.4429', 'step': 1062, 'global_step': 11619} [2025-08-16 18:54:30] {'loss': '0.6351', 'loss_video': '0.2434', 'loss_audio': '0.3917', 'step': 1072, 'global_step': 11629} [2025-08-16 18:56:48] {'loss': '0.7096', 'loss_video': '0.2913', 'loss_audio': '0.4184', 'step': 1082, 'global_step': 11639} [2025-08-16 18:59:19] {'loss': '0.6871', 'loss_video': '0.2769', 'loss_audio': '0.4102', 'step': 1092, 'global_step': 11649} [2025-08-16 19:02:06] {'loss': '0.7132', 'loss_video': '0.2819', 'loss_audio': '0.4313', 'step': 1102, 'global_step': 11659} 
[2025-08-16 19:04:26] {'loss': '0.7458', 'loss_video': '0.3041', 'loss_audio': '0.4418', 'step': 1112, 'global_step': 11669} [2025-08-16 19:06:56] {'loss': '0.6756', 'loss_video': '0.2514', 'loss_audio': '0.4242', 'step': 1122, 'global_step': 11679} [2025-08-16 19:09:40] {'loss': '0.6803', 'loss_video': '0.2629', 'loss_audio': '0.4174', 'step': 1132, 'global_step': 11689} [2025-08-16 19:12:04] {'loss': '0.6830', 'loss_video': '0.2540', 'loss_audio': '0.4290', 'step': 1142, 'global_step': 11699} [2025-08-16 19:14:48] {'loss': '0.6604', 'loss_video': '0.2647', 'loss_audio': '0.3956', 'step': 1152, 'global_step': 11709} [2025-08-16 19:17:30] {'loss': '0.6678', 'loss_video': '0.2535', 'loss_audio': '0.4143', 'step': 1162, 'global_step': 11719} [2025-08-16 19:20:14] {'loss': '0.6927', 'loss_video': '0.2656', 'loss_audio': '0.4270', 'step': 1172, 'global_step': 11729} [2025-08-16 19:22:53] {'loss': '0.6940', 'loss_video': '0.2740', 'loss_audio': '0.4200', 'step': 1182, 'global_step': 11739} [2025-08-16 19:25:17] {'loss': '0.7247', 'loss_video': '0.2762', 'loss_audio': '0.4485', 'step': 1192, 'global_step': 11749} [2025-08-16 19:25:23] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 19:25:41] Saved checkpoint at epoch 1, step 1193, global_step 11750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11750 [2025-08-16 19:25:41] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 19:28:11] {'loss': '0.7207', 'loss_video': '0.2689', 'loss_audio': '0.4517', 'step': 1202, 'global_step': 11759} [2025-08-16 19:30:37] {'loss': '0.6852', 'loss_video': '0.2738', 'loss_audio': '0.4114', 'step': 1212, 'global_step': 11769} [2025-08-16 19:33:10] {'loss': '0.7665', 'loss_video': '0.2854', 'loss_audio': '0.4811', 'step': 1222, 'global_step': 11779} [2025-08-16 19:36:06] {'loss': '0.7371', 'loss_video': '0.3013', 'loss_audio': '0.4358', 'step': 1232, 'global_step': 11789} [2025-08-16 19:38:24] {'loss': '0.6408', 'loss_video': '0.2450', 'loss_audio': '0.3958', 'step': 1242, 'global_step': 11799} [2025-08-16 19:40:43] {'loss': '0.6236', 'loss_video': '0.2338', 'loss_audio': '0.3898', 'step': 1252, 'global_step': 11809} [2025-08-16 19:43:05] {'loss': '0.6500', 'loss_video': '0.2517', 'loss_audio': '0.3983', 'step': 1262, 'global_step': 11819} [2025-08-16 19:45:26] {'loss': '0.6919', 'loss_video': '0.2701', 'loss_audio': '0.4218', 'step': 1272, 'global_step': 11829} [2025-08-16 19:47:53] {'loss': '0.7403', 'loss_video': '0.2510', 'loss_audio': '0.4894', 'step': 1282, 'global_step': 11839} [2025-08-16 19:50:07] {'loss': '0.7701', 'loss_video': '0.3035', 'loss_audio': '0.4666', 'step': 1292, 'global_step': 11849} [2025-08-16 19:52:35] {'loss': '0.6247', 'loss_video': '0.2333', 'loss_audio': '0.3914', 'step': 1302, 'global_step': 11859} [2025-08-16 19:55:07] {'loss': '0.6645', 'loss_video': '0.2773', 'loss_audio': '0.3871', 'step': 1312, 'global_step': 11869} [2025-08-16 19:57:17] {'loss': '0.6688', 'loss_video': '0.2767', 'loss_audio': '0.3921', 'step': 1322, 'global_step': 11879} [2025-08-16 19:59:50] {'loss': '0.6646', 'loss_video': '0.2438', 'loss_audio': '0.4208', 'step': 1332, 'global_step': 11889} [2025-08-16 20:02:13] {'loss': '0.6889', 'loss_video': '0.2695', 'loss_audio': '0.4194', 'step': 1342, 'global_step': 11899} [2025-08-16 20:04:45] {'loss': '0.6701', 'loss_video': '0.2504', 'loss_audio': '0.4197', 'step': 1352, 'global_step': 11909} 
[2025-08-16 20:07:16] {'loss': '0.6773', 'loss_video': '0.2734', 'loss_audio': '0.4040', 'step': 1362, 'global_step': 11919} [2025-08-16 20:09:23] {'loss': '0.6362', 'loss_video': '0.2358', 'loss_audio': '0.4004', 'step': 1372, 'global_step': 11929} [2025-08-16 20:11:27] {'loss': '0.6665', 'loss_video': '0.2535', 'loss_audio': '0.4129', 'step': 1382, 'global_step': 11939} [2025-08-16 20:13:36] {'loss': '0.6915', 'loss_video': '0.2547', 'loss_audio': '0.4368', 'step': 1392, 'global_step': 11949} [2025-08-16 20:15:57] {'loss': '0.6562', 'loss_video': '0.2396', 'loss_audio': '0.4167', 'step': 1402, 'global_step': 11959} [2025-08-16 20:18:16] {'loss': '0.7165', 'loss_video': '0.2510', 'loss_audio': '0.4654', 'step': 1412, 'global_step': 11969} [2025-08-16 20:20:34] {'loss': '0.6417', 'loss_video': '0.2698', 'loss_audio': '0.3718', 'step': 1422, 'global_step': 11979} [2025-08-16 20:22:47] {'loss': '0.6300', 'loss_video': '0.2398', 'loss_audio': '0.3902', 'step': 1432, 'global_step': 11989} [2025-08-16 20:24:38] {'loss': '0.7249', 'loss_video': '0.2806', 'loss_audio': '0.4443', 'step': 1442, 'global_step': 11999} [2025-08-16 20:24:45] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 20:25:03] Saved checkpoint at epoch 1, step 1443, global_step 12000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12000 [2025-08-16 20:25:04] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 20:27:44] {'loss': '0.6640', 'loss_video': '0.2477', 'loss_audio': '0.4164', 'step': 1452, 'global_step': 12009} [2025-08-16 20:30:04] {'loss': '0.7132', 'loss_video': '0.2489', 'loss_audio': '0.4643', 'step': 1462, 'global_step': 12019} [2025-08-16 20:32:20] {'loss': '0.6804', 'loss_video': '0.2460', 'loss_audio': '0.4345', 'step': 1472, 'global_step': 12029} [2025-08-16 20:35:01] {'loss': '0.7009', 'loss_video': '0.2936', 'loss_audio': '0.4073', 'step': 1482, 'global_step': 12039} [2025-08-16 20:37:19] {'loss': '0.6844', 'loss_video': '0.2678', 'loss_audio': '0.4166', 'step': 1492, 'global_step': 12049} [2025-08-16 20:39:24] {'loss': '0.6437', 'loss_video': '0.2398', 'loss_audio': '0.4038', 'step': 1502, 'global_step': 12059} [2025-08-16 20:41:49] {'loss': '0.7247', 'loss_video': '0.2848', 'loss_audio': '0.4399', 'step': 1512, 'global_step': 12069} [2025-08-16 20:44:25] {'loss': '0.8071', 'loss_video': '0.3084', 'loss_audio': '0.4987', 'step': 1522, 'global_step': 12079} [2025-08-16 20:46:59] {'loss': '0.6951', 'loss_video': '0.2748', 'loss_audio': '0.4203', 'step': 1532, 'global_step': 12089} [2025-08-16 20:49:22] {'loss': '0.7168', 'loss_video': '0.3002', 'loss_audio': '0.4165', 'step': 1542, 'global_step': 12099} [2025-08-16 20:51:56] {'loss': '0.6512', 'loss_video': '0.2684', 'loss_audio': '0.3828', 'step': 1552, 'global_step': 12109} [2025-08-16 20:54:07] {'loss': '0.7011', 'loss_video': '0.2867', 'loss_audio': '0.4144', 'step': 1562, 'global_step': 12119} [2025-08-16 20:56:46] {'loss': '0.6110', 'loss_video': '0.2355', 'loss_audio': '0.3754', 'step': 1572, 'global_step': 12129} [2025-08-16 20:59:25] {'loss': '0.7313', 'loss_video': '0.3015', 'loss_audio': '0.4297', 'step': 1582, 'global_step': 12139} [2025-08-16 21:02:00] {'loss': '0.6637', 'loss_video': '0.2360', 'loss_audio': '0.4277', 'step': 1592, 'global_step': 12149} [2025-08-16 21:04:41] {'loss': '0.6929', 'loss_video': '0.2653', 'loss_audio': '0.4276', 'step': 1602, 'global_step': 12159} 
[2025-08-16 21:07:30] {'loss': '0.6415', 'loss_video': '0.2364', 'loss_audio': '0.4051', 'step': 1612, 'global_step': 12169} [2025-08-16 21:10:20] {'loss': '0.6293', 'loss_video': '0.2540', 'loss_audio': '0.3753', 'step': 1622, 'global_step': 12179} [2025-08-16 21:13:01] {'loss': '0.6960', 'loss_video': '0.2702', 'loss_audio': '0.4259', 'step': 1632, 'global_step': 12189} [2025-08-16 21:15:34] {'loss': '0.6341', 'loss_video': '0.2482', 'loss_audio': '0.3859', 'step': 1642, 'global_step': 12199} [2025-08-16 21:18:06] {'loss': '0.7014', 'loss_video': '0.2559', 'loss_audio': '0.4455', 'step': 1652, 'global_step': 12209} [2025-08-16 21:20:34] {'loss': '0.7060', 'loss_video': '0.2838', 'loss_audio': '0.4222', 'step': 1662, 'global_step': 12219} [2025-08-16 21:23:01] {'loss': '0.7375', 'loss_video': '0.2947', 'loss_audio': '0.4429', 'step': 1672, 'global_step': 12229} [2025-08-16 21:25:22] {'loss': '0.6878', 'loss_video': '0.2531', 'loss_audio': '0.4347', 'step': 1682, 'global_step': 12239} [2025-08-16 21:27:25] {'loss': '0.7015', 'loss_video': '0.2676', 'loss_audio': '0.4339', 'step': 1692, 'global_step': 12249} [2025-08-16 21:27:31] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 21:27:50] Saved checkpoint at epoch 1, step 1693, global_step 12250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12250 [2025-08-16 21:27:50] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step11750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 21:30:26] {'loss': '0.6904', 'loss_video': '0.2753', 'loss_audio': '0.4151', 'step': 1702, 'global_step': 12259} [2025-08-16 21:33:22] {'loss': '0.6722', 'loss_video': '0.2651', 'loss_audio': '0.4071', 'step': 1712, 'global_step': 12269} [2025-08-16 21:36:00] {'loss': '0.6847', 'loss_video': '0.2723', 'loss_audio': '0.4124', 'step': 1722, 'global_step': 12279} [2025-08-16 21:38:34] {'loss': '0.7081', 'loss_video': '0.3037', 'loss_audio': '0.4044', 'step': 1732, 'global_step': 12289} [2025-08-16 21:41:20] {'loss': '0.6736', 'loss_video': '0.2697', 'loss_audio': '0.4040', 'step': 1742, 'global_step': 12299} [2025-08-16 21:43:43] {'loss': '0.6598', 'loss_video': '0.2660', 'loss_audio': '0.3938', 'step': 1752, 'global_step': 12309} [2025-08-16 21:46:08] {'loss': '0.6654', 'loss_video': '0.2544', 'loss_audio': '0.4109', 'step': 1762, 'global_step': 12319} [2025-08-16 21:48:33] {'loss': '0.6974', 'loss_video': '0.2637', 'loss_audio': '0.4337', 'step': 1772, 'global_step': 12329} [2025-08-16 21:50:56] {'loss': '0.6558', 'loss_video': '0.2687', 'loss_audio': '0.3871', 'step': 1782, 'global_step': 12339} [2025-08-16 21:53:45] {'loss': '0.7112', 'loss_video': '0.2636', 'loss_audio': '0.4476', 'step': 1792, 'global_step': 12349} [2025-08-16 21:55:56] {'loss': '0.6343', 'loss_video': '0.2445', 'loss_audio': '0.3898', 'step': 1802, 'global_step': 12359} [2025-08-16 21:58:20] {'loss': '0.7165', 'loss_video': '0.2849', 'loss_audio': '0.4316', 'step': 1812, 'global_step': 12369} [2025-08-16 22:00:49] {'loss': '0.6700', 'loss_video': '0.2418', 'loss_audio': '0.4282', 'step': 1822, 'global_step': 12379} [2025-08-16 22:03:28] {'loss': '0.6705', 'loss_video': '0.2434', 'loss_audio': '0.4271', 'step': 1832, 'global_step': 12389} [2025-08-16 22:05:51] {'loss': '0.6962', 'loss_video': '0.2576', 'loss_audio': '0.4387', 'step': 1842, 'global_step': 12399} [2025-08-16 22:08:03] {'loss': '0.6638', 'loss_video': '0.2593', 'loss_audio': '0.4044', 'step': 1852, 'global_step': 12409} 
[2025-08-16 22:10:27] {'loss': '0.6787', 'loss_video': '0.2712', 'loss_audio': '0.4075', 'step': 1862, 'global_step': 12419} [2025-08-16 22:13:03] {'loss': '0.6582', 'loss_video': '0.2574', 'loss_audio': '0.4008', 'step': 1872, 'global_step': 12429} [2025-08-16 22:15:43] {'loss': '0.6870', 'loss_video': '0.2662', 'loss_audio': '0.4208', 'step': 1882, 'global_step': 12439} [2025-08-16 22:18:06] {'loss': '0.7051', 'loss_video': '0.2698', 'loss_audio': '0.4353', 'step': 1892, 'global_step': 12449} [2025-08-16 22:20:50] {'loss': '0.7618', 'loss_video': '0.2986', 'loss_audio': '0.4633', 'step': 1902, 'global_step': 12459} [2025-08-16 22:23:16] {'loss': '0.6652', 'loss_video': '0.2609', 'loss_audio': '0.4043', 'step': 1912, 'global_step': 12469} [2025-08-16 22:25:40] {'loss': '0.6901', 'loss_video': '0.2558', 'loss_audio': '0.4343', 'step': 1922, 'global_step': 12479} [2025-08-16 22:28:06] {'loss': '0.6980', 'loss_video': '0.2666', 'loss_audio': '0.4314', 'step': 1932, 'global_step': 12489} [2025-08-16 22:30:36] {'loss': '0.7410', 'loss_video': '0.2817', 'loss_audio': '0.4593', 'step': 1942, 'global_step': 12499} [2025-08-16 22:30:43] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 22:31:02] Saved checkpoint at epoch 1, step 1943, global_step 12500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12500 [2025-08-16 22:31:02] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 22:33:30] {'loss': '0.6621', 'loss_video': '0.2505', 'loss_audio': '0.4116', 'step': 1952, 'global_step': 12509} [2025-08-16 22:36:03] {'loss': '0.6870', 'loss_video': '0.2621', 'loss_audio': '0.4249', 'step': 1962, 'global_step': 12519} [2025-08-16 22:38:19] {'loss': '0.6749', 'loss_video': '0.2624', 'loss_audio': '0.4126', 'step': 1972, 'global_step': 12529} [2025-08-16 22:40:51] {'loss': '0.6957', 'loss_video': '0.2782', 'loss_audio': '0.4175', 'step': 1982, 'global_step': 12539} [2025-08-16 22:43:25] {'loss': '0.6635', 'loss_video': '0.2447', 'loss_audio': '0.4188', 'step': 1992, 'global_step': 12549} [2025-08-16 22:46:01] {'loss': '0.6710', 'loss_video': '0.2533', 'loss_audio': '0.4177', 'step': 2002, 'global_step': 12559} [2025-08-16 22:48:30] {'loss': '0.6642', 'loss_video': '0.2648', 'loss_audio': '0.3994', 'step': 2012, 'global_step': 12569} [2025-08-16 22:50:57] {'loss': '0.7171', 'loss_video': '0.2859', 'loss_audio': '0.4312', 'step': 2022, 'global_step': 12579} [2025-08-16 22:53:19] {'loss': '0.6913', 'loss_video': '0.2476', 'loss_audio': '0.4437', 'step': 2032, 'global_step': 12589} [2025-08-16 22:55:33] {'loss': '0.6353', 'loss_video': '0.2531', 'loss_audio': '0.3822', 'step': 2042, 'global_step': 12599} [2025-08-16 22:57:46] {'loss': '0.6440', 'loss_video': '0.2447', 'loss_audio': '0.3993', 'step': 2052, 'global_step': 12609} [2025-08-16 23:00:04] {'loss': '0.6251', 'loss_video': '0.2462', 'loss_audio': '0.3789', 'step': 2062, 'global_step': 12619} [2025-08-16 23:02:26] {'loss': '0.7013', 'loss_video': '0.2709', 'loss_audio': '0.4304', 'step': 2072, 'global_step': 12629} [2025-08-16 23:04:52] {'loss': '0.6337', 'loss_video': '0.2384', 'loss_audio': '0.3953', 'step': 2082, 'global_step': 12639} [2025-08-16 23:07:32] {'loss': '0.7231', 'loss_video': '0.3121', 'loss_audio': '0.4110', 'step': 2092, 'global_step': 12649} [2025-08-16 23:10:05] {'loss': '0.6940', 'loss_video': '0.2766', 'loss_audio': '0.4174', 'step': 2102, 'global_step': 12659} 
[2025-08-16 23:12:30] {'loss': '0.7092', 'loss_video': '0.2320', 'loss_audio': '0.4772', 'step': 2112, 'global_step': 12669} [2025-08-16 23:14:49] {'loss': '0.6905', 'loss_video': '0.2653', 'loss_audio': '0.4252', 'step': 2122, 'global_step': 12679} [2025-08-16 23:17:15] {'loss': '0.6601', 'loss_video': '0.2367', 'loss_audio': '0.4234', 'step': 2132, 'global_step': 12689} [2025-08-16 23:19:41] {'loss': '0.7935', 'loss_video': '0.2600', 'loss_audio': '0.5335', 'step': 2142, 'global_step': 12699} [2025-08-16 23:22:07] {'loss': '0.6746', 'loss_video': '0.2601', 'loss_audio': '0.4145', 'step': 2152, 'global_step': 12709} [2025-08-16 23:24:35] {'loss': '0.7754', 'loss_video': '0.2620', 'loss_audio': '0.5135', 'step': 2162, 'global_step': 12719} [2025-08-16 23:27:02] {'loss': '0.6870', 'loss_video': '0.2637', 'loss_audio': '0.4233', 'step': 2172, 'global_step': 12729} [2025-08-16 23:29:46] {'loss': '0.7822', 'loss_video': '0.3008', 'loss_audio': '0.4814', 'step': 2182, 'global_step': 12739} [2025-08-16 23:32:13] {'loss': '0.7280', 'loss_video': '0.2593', 'loss_audio': '0.4686', 'step': 2192, 'global_step': 12749} [2025-08-16 23:32:19] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-16 23:32:38] Saved checkpoint at epoch 1, step 2193, global_step 12750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12750 [2025-08-16 23:32:38] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-16 23:35:07] {'loss': '0.6508', 'loss_video': '0.2321', 'loss_audio': '0.4187', 'step': 2202, 'global_step': 12759}
[2025-08-16 23:37:38] {'loss': '0.6595', 'loss_video': '0.2660', 'loss_audio': '0.3935', 'step': 2212, 'global_step': 12769}
[2025-08-16 23:39:57] {'loss': '0.7237', 'loss_video': '0.2725', 'loss_audio': '0.4512', 'step': 2222, 'global_step': 12779}
[2025-08-16 23:42:13] {'loss': '0.7212', 'loss_video': '0.2646', 'loss_audio': '0.4567', 'step': 2232, 'global_step': 12789}
[2025-08-16 23:44:27] {'loss': '0.7607', 'loss_video': '0.3033', 'loss_audio': '0.4574', 'step': 2242, 'global_step': 12799}
[2025-08-16 23:47:00] {'loss': '0.6728', 'loss_video': '0.2554', 'loss_audio': '0.4174', 'step': 2252, 'global_step': 12809}
[2025-08-16 23:49:19] {'loss': '0.6559', 'loss_video': '0.2291', 'loss_audio': '0.4268', 'step': 2262, 'global_step': 12819}
[2025-08-16 23:51:28] {'loss': '0.7064', 'loss_video': '0.2688', 'loss_audio': '0.4376', 'step': 2272, 'global_step': 12829}
[2025-08-16 23:54:13] {'loss': '0.6349', 'loss_video': '0.2443', 'loss_audio': '0.3907', 'step': 2282, 'global_step': 12839}
[2025-08-16 23:56:20] {'loss': '0.6806', 'loss_video': '0.2634', 'loss_audio': '0.4172', 'step': 2292, 'global_step': 12849}
[2025-08-16 23:58:47] {'loss': '0.7084', 'loss_video': '0.2734', 'loss_audio': '0.4350', 'step': 2302, 'global_step': 12859}
[2025-08-17 00:01:11] {'loss': '0.6936', 'loss_video': '0.2827', 'loss_audio': '0.4109', 'step': 2312, 'global_step': 12869}
[2025-08-17 00:03:32] {'loss': '0.7250', 'loss_video': '0.2838', 'loss_audio': '0.4412', 'step': 2322, 'global_step': 12879}
[2025-08-17 00:05:48] {'loss': '0.6849', 'loss_video': '0.2600', 'loss_audio': '0.4248', 'step': 2332, 'global_step': 12889}
[2025-08-17 00:08:20] {'loss': '0.5999', 'loss_video': '0.2185', 'loss_audio': '0.3814', 'step': 2342, 'global_step': 12899}
[2025-08-17 00:10:50] {'loss': '0.6716', 'loss_video': '0.2662', 'loss_audio': '0.4053', 'step': 2352, 'global_step': 12909}
[2025-08-17 00:13:02] {'loss': '0.6546', 'loss_video': '0.2256', 'loss_audio': '0.4290', 'step': 2362, 'global_step': 12919}
[2025-08-17 00:15:28] {'loss': '0.6476', 'loss_video': '0.2642', 'loss_audio': '0.3834', 'step': 2372, 'global_step': 12929}
[2025-08-17 00:17:50] {'loss': '0.6855', 'loss_video': '0.2544', 'loss_audio': '0.4311', 'step': 2382, 'global_step': 12939}
[2025-08-17 00:20:20] {'loss': '0.6713', 'loss_video': '0.2749', 'loss_audio': '0.3964', 'step': 2392, 'global_step': 12949}
[2025-08-17 00:22:57] {'loss': '0.6766', 'loss_video': '0.2412', 'loss_audio': '0.4354', 'step': 2402, 'global_step': 12959}
[2025-08-17 00:25:13] {'loss': '0.7532', 'loss_video': '0.3205', 'loss_audio': '0.4327', 'step': 2412, 'global_step': 12969}
[2025-08-17 00:27:34] {'loss': '0.6289', 'loss_video': '0.2399', 'loss_audio': '0.3890', 'step': 2422, 'global_step': 12979}
[2025-08-17 00:29:52] {'loss': '0.7435', 'loss_video': '0.2729', 'loss_audio': '0.4707', 'step': 2432, 'global_step': 12989}
[2025-08-17 00:32:26] {'loss': '0.6651', 'loss_video': '0.2425', 'loss_audio': '0.4225', 'step': 2442, 'global_step': 12999}
[2025-08-17 00:32:33] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 00:32:52] Saved checkpoint at epoch 1, step 2443, global_step 13000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13000
[2025-08-17 00:32:52] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12500 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 00:35:03] {'loss': '0.6529', 'loss_video': '0.2435', 'loss_audio': '0.4094', 'step': 2452, 'global_step': 13009}
[2025-08-17 00:37:45] {'loss': '0.7474', 'loss_video': '0.3169', 'loss_audio': '0.4304', 'step': 2462, 'global_step': 13019}
[2025-08-17 00:40:15] {'loss': '0.6769', 'loss_video': '0.2653', 'loss_audio': '0.4116', 'step': 2472, 'global_step': 13029}
[2025-08-17 00:42:16] {'loss': '0.7117', 'loss_video': '0.2528', 'loss_audio': '0.4590', 'step': 2482, 'global_step': 13039}
[2025-08-17 00:44:25] {'loss': '0.6689', 'loss_video': '0.2534', 'loss_audio': '0.4155', 'step': 2492, 'global_step': 13049}
[2025-08-17 00:47:19] {'loss': '0.7234', 'loss_video': '0.2943', 'loss_audio': '0.4291', 'step': 2502, 'global_step': 13059}
[2025-08-17 00:50:04] {'loss': '0.7498', 'loss_video': '0.3216', 'loss_audio': '0.4282', 'step': 2512, 'global_step': 13069}
[2025-08-17 00:52:28] {'loss': '0.7205', 'loss_video': '0.2888', 'loss_audio': '0.4317', 'step': 2522, 'global_step': 13079}
[2025-08-17 00:54:47] {'loss': '0.6408', 'loss_video': '0.2534', 'loss_audio': '0.3874', 'step': 2532, 'global_step': 13089}
[2025-08-17 00:57:03] {'loss': '0.6882', 'loss_video': '0.2718', 'loss_audio': '0.4165', 'step': 2542, 'global_step': 13099}
[2025-08-17 00:59:33] {'loss': '0.6880', 'loss_video': '0.2803', 'loss_audio': '0.4077', 'step': 2552, 'global_step': 13109}
[2025-08-17 01:01:41] {'loss': '0.7122', 'loss_video': '0.2729', 'loss_audio': '0.4393', 'step': 2562, 'global_step': 13119}
[2025-08-17 01:04:04] {'loss': '0.6690', 'loss_video': '0.2449', 'loss_audio': '0.4241', 'step': 2572, 'global_step': 13129}
[2025-08-17 01:06:31] {'loss': '0.6468', 'loss_video': '0.2411', 'loss_audio': '0.4057', 'step': 2582, 'global_step': 13139}
[2025-08-17 01:09:04] {'loss': '0.7034', 'loss_video': '0.2523', 'loss_audio': '0.4511', 'step': 2592, 'global_step': 13149}
[2025-08-17 01:11:56] {'loss': '0.7261', 'loss_video': '0.2925', 'loss_audio': '0.4337', 'step': 2602, 'global_step': 13159}
[2025-08-17 01:14:49] {'loss': '0.6865', 'loss_video': '0.2596', 'loss_audio': '0.4269', 'step': 2612, 'global_step': 13169}
[2025-08-17 01:17:14] {'loss': '0.6256', 'loss_video': '0.2376', 'loss_audio': '0.3879', 'step': 2622, 'global_step': 13179}
[2025-08-17 01:19:44] {'loss': '0.6442', 'loss_video': '0.2400', 'loss_audio': '0.4043', 'step': 2632, 'global_step': 13189}
[2025-08-17 01:21:57] {'loss': '0.6554', 'loss_video': '0.2543', 'loss_audio': '0.4011', 'step': 2642, 'global_step': 13199}
[2025-08-17 01:24:15] {'loss': '0.6519', 'loss_video': '0.2431', 'loss_audio': '0.4088', 'step': 2652, 'global_step': 13209}
[2025-08-17 01:26:37] {'loss': '0.6687', 'loss_video': '0.2359', 'loss_audio': '0.4328', 'step': 2662, 'global_step': 13219}
[2025-08-17 01:28:43] {'loss': '0.7140', 'loss_video': '0.2821', 'loss_audio': '0.4319', 'step': 2672, 'global_step': 13229}
[2025-08-17 01:31:10] {'loss': '0.6574', 'loss_video': '0.2487', 'loss_audio': '0.4087', 'step': 2682, 'global_step': 13239}
[2025-08-17 01:33:37] {'loss': '0.6560', 'loss_video': '0.2596', 'loss_audio': '0.3964', 'step': 2692, 'global_step': 13249}
[2025-08-17 01:33:44] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 01:34:04] Saved checkpoint at epoch 1, step 2693, global_step 13250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13250
[2025-08-17 01:34:04] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step12750 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 01:36:48] {'loss': '0.7182', 'loss_video': '0.2718', 'loss_audio': '0.4464', 'step': 2702, 'global_step': 13259}
[2025-08-17 01:39:27] {'loss': '0.7153', 'loss_video': '0.2745', 'loss_audio': '0.4408', 'step': 2712, 'global_step': 13269}
[2025-08-17 01:41:54] {'loss': '0.6954', 'loss_video': '0.2665', 'loss_audio': '0.4289', 'step': 2722, 'global_step': 13279}
[2025-08-17 01:44:25] {'loss': '0.6310', 'loss_video': '0.2345', 'loss_audio': '0.3965', 'step': 2732, 'global_step': 13289}
[2025-08-17 01:46:39] {'loss': '0.6724', 'loss_video': '0.2470', 'loss_audio': '0.4254', 'step': 2742, 'global_step': 13299}
[2025-08-17 01:49:27] {'loss': '0.6880', 'loss_video': '0.2720', 'loss_audio': '0.4161', 'step': 2752, 'global_step': 13309}
[2025-08-17 01:52:12] {'loss': '0.6964', 'loss_video': '0.2792', 'loss_audio': '0.4172', 'step': 2762, 'global_step': 13319}
[2025-08-17 01:54:39] {'loss': '0.6701', 'loss_video': '0.2690', 'loss_audio': '0.4011', 'step': 2772, 'global_step': 13329}
[2025-08-17 01:57:15] {'loss': '0.6298', 'loss_video': '0.2399', 'loss_audio': '0.3899', 'step': 2782, 'global_step': 13339}
[2025-08-17 01:59:38] {'loss': '0.7013', 'loss_video': '0.2623', 'loss_audio': '0.4390', 'step': 2792, 'global_step': 13349}
[2025-08-17 02:02:19] {'loss': '0.6504', 'loss_video': '0.2410', 'loss_audio': '0.4094', 'step': 2802, 'global_step': 13359}
[2025-08-17 02:04:36] {'loss': '0.6284', 'loss_video': '0.2330', 'loss_audio': '0.3954', 'step': 2812, 'global_step': 13369}
[2025-08-17 02:07:09] {'loss': '0.6428', 'loss_video': '0.2190', 'loss_audio': '0.4238', 'step': 2822, 'global_step': 13379}
[2025-08-17 02:09:42] {'loss': '0.7277', 'loss_video': '0.2676', 'loss_audio': '0.4601', 'step': 2832, 'global_step': 13389}
[2025-08-17 02:12:23] {'loss': '0.7300', 'loss_video': '0.2806', 'loss_audio': '0.4494', 'step': 2842, 'global_step': 13399}
[2025-08-17 02:14:55] {'loss': '0.6486', 'loss_video': '0.2471', 'loss_audio': '0.4015', 'step': 2852, 'global_step': 13409}
[2025-08-17 02:17:16] {'loss': '0.7538', 'loss_video': '0.3031', 'loss_audio': '0.4508', 'step': 2862, 'global_step': 13419}
[2025-08-17 02:19:49] {'loss': '0.6413', 'loss_video': '0.2434', 'loss_audio': '0.3979', 'step': 2872, 'global_step': 13429}
[2025-08-17 02:22:01] {'loss': '0.6751', 'loss_video': '0.2543', 'loss_audio': '0.4208', 'step': 2882, 'global_step': 13439}
[2025-08-17 02:24:42] {'loss': '0.6809', 'loss_video': '0.2872', 'loss_audio': '0.3937', 'step': 2892, 'global_step': 13449}
[2025-08-17 02:27:06] {'loss': '0.6590', 'loss_video': '0.2712', 'loss_audio': '0.3878', 'step': 2902, 'global_step': 13459}
[2025-08-17 02:29:35] {'loss': '0.7208', 'loss_video': '0.2934', 'loss_audio': '0.4274', 'step': 2912, 'global_step': 13469}
[2025-08-17 02:31:56] {'loss': '0.6846', 'loss_video': '0.2684', 'loss_audio': '0.4162', 'step': 2922, 'global_step': 13479}
[2025-08-17 02:34:24] {'loss': '0.7349', 'loss_video': '0.2830', 'loss_audio': '0.4519', 'step': 2932, 'global_step': 13489}
[2025-08-17 02:36:53] {'loss': '0.6665', 'loss_video': '0.2507', 'loss_audio': '0.4158', 'step': 2942, 'global_step': 13499}
[2025-08-17 02:37:01] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 02:37:22] Saved checkpoint at epoch 1, step 2943, global_step 13500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13500
[2025-08-17 02:37:22] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13000 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 02:40:00] {'loss': '0.6941', 'loss_video': '0.2746', 'loss_audio': '0.4196', 'step': 2952, 'global_step': 13509}
[2025-08-17 02:42:40] {'loss': '0.6693', 'loss_video': '0.2496', 'loss_audio': '0.4197', 'step': 2962, 'global_step': 13519}
[2025-08-17 02:45:06] {'loss': '0.6695', 'loss_video': '0.2563', 'loss_audio': '0.4132', 'step': 2972, 'global_step': 13529}
[2025-08-17 02:47:33] {'loss': '0.7400', 'loss_video': '0.2764', 'loss_audio': '0.4636', 'step': 2982, 'global_step': 13539}
[2025-08-17 02:49:54] {'loss': '0.6924', 'loss_video': '0.2417', 'loss_audio': '0.4507', 'step': 2992, 'global_step': 13549}
[2025-08-17 02:52:21] {'loss': '0.6994', 'loss_video': '0.2660', 'loss_audio': '0.4333', 'step': 3002, 'global_step': 13559}
[2025-08-17 02:54:35] {'loss': '0.6826', 'loss_video': '0.2439', 'loss_audio': '0.4387', 'step': 3012, 'global_step': 13569}
[2025-08-17 02:56:55] {'loss': '0.6868', 'loss_video': '0.2917', 'loss_audio': '0.3951', 'step': 3022, 'global_step': 13579}
[2025-08-17 02:59:32] {'loss': '0.6905', 'loss_video': '0.2403', 'loss_audio': '0.4501', 'step': 3032, 'global_step': 13589}
[2025-08-17 03:02:08] {'loss': '0.6907', 'loss_video': '0.2625', 'loss_audio': '0.4281', 'step': 3042, 'global_step': 13599}
[2025-08-17 03:04:49] {'loss': '0.7115', 'loss_video': '0.2847', 'loss_audio': '0.4268', 'step': 3052, 'global_step': 13609}
[2025-08-17 03:07:32] {'loss': '0.7454', 'loss_video': '0.2917', 'loss_audio': '0.4538', 'step': 3062, 'global_step': 13619}
[2025-08-17 03:09:46] {'loss': '0.7391', 'loss_video': '0.2643', 'loss_audio': '0.4748', 'step': 3072, 'global_step': 13629}
[2025-08-17 03:12:08] {'loss': '0.7294', 'loss_video': '0.2995', 'loss_audio': '0.4299', 'step': 3082, 'global_step': 13639}
[2025-08-17 03:14:26] {'loss': '0.7197', 'loss_video': '0.2802', 'loss_audio': '0.4395', 'step': 3092, 'global_step': 13649}
[2025-08-17 03:16:55] {'loss': '0.6854', 'loss_video': '0.2607', 'loss_audio': '0.4248', 'step': 3102, 'global_step': 13659}
[2025-08-17 03:19:04] {'loss': '0.6228', 'loss_video': '0.2421', 'loss_audio': '0.3807', 'step': 3112, 'global_step': 13669}
[2025-08-17 03:21:47] {'loss': '0.6140', 'loss_video': '0.2177', 'loss_audio': '0.3963', 'step': 3122, 'global_step': 13679}
[2025-08-17 03:24:28] {'loss': '0.7078', 'loss_video': '0.2754', 'loss_audio': '0.4325', 'step': 3132, 'global_step': 13689}
[2025-08-17 03:27:07] {'loss': '0.6414', 'loss_video': '0.2475', 'loss_audio': '0.3939', 'step': 3142, 'global_step': 13699}
[2025-08-17 03:29:16] {'loss': '0.6547', 'loss_video': '0.2444', 'loss_audio': '0.4103', 'step': 3152, 'global_step': 13709}
[2025-08-17 03:31:45] {'loss': '0.6922', 'loss_video': '0.2693', 'loss_audio': '0.4230', 'step': 3162, 'global_step': 13719}
[2025-08-17 03:34:02] {'loss': '0.6822', 'loss_video': '0.2447', 'loss_audio': '0.4376', 'step': 3172, 'global_step': 13729}
[2025-08-17 03:36:25] {'loss': '0.6348', 'loss_video': '0.2597', 'loss_audio': '0.3751', 'step': 3182, 'global_step': 13739}
[2025-08-17 03:39:05] {'loss': '0.6777', 'loss_video': '0.2688', 'loss_audio': '0.4090', 'step': 3192, 'global_step': 13749}
[2025-08-17 03:39:12] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 03:39:31] Saved checkpoint at epoch 1, step 3193, global_step 13750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13750
[2025-08-17 03:39:31] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13250 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 03:41:51] {'loss': '0.6969', 'loss_video': '0.2672', 'loss_audio': '0.4298', 'step': 3202, 'global_step': 13759}
[2025-08-17 03:44:06] {'loss': '0.6661', 'loss_video': '0.2622', 'loss_audio': '0.4040', 'step': 3212, 'global_step': 13769}
[2025-08-17 03:46:47] {'loss': '0.7392', 'loss_video': '0.2961', 'loss_audio': '0.4431', 'step': 3222, 'global_step': 13779}
[2025-08-17 03:49:31] {'loss': '0.6951', 'loss_video': '0.2649', 'loss_audio': '0.4302', 'step': 3232, 'global_step': 13789}
[2025-08-17 03:52:00] {'loss': '0.6727', 'loss_video': '0.2648', 'loss_audio': '0.4080', 'step': 3242, 'global_step': 13799}
[2025-08-17 03:54:26] {'loss': '0.6649', 'loss_video': '0.2464', 'loss_audio': '0.4186', 'step': 3252, 'global_step': 13809}
[2025-08-17 03:56:56] {'loss': '0.7273', 'loss_video': '0.2728', 'loss_audio': '0.4545', 'step': 3262, 'global_step': 13819}
[2025-08-17 03:59:43] {'loss': '0.6538', 'loss_video': '0.2574', 'loss_audio': '0.3964', 'step': 3272, 'global_step': 13829}
[2025-08-17 04:02:21] {'loss': '0.6446', 'loss_video': '0.2574', 'loss_audio': '0.3872', 'step': 3282, 'global_step': 13839}
[2025-08-17 04:04:52] {'loss': '0.7339', 'loss_video': '0.2750', 'loss_audio': '0.4588', 'step': 3292, 'global_step': 13849}
[2025-08-17 04:07:33] {'loss': '0.7088', 'loss_video': '0.2822', 'loss_audio': '0.4266', 'step': 3302, 'global_step': 13859}
[2025-08-17 04:09:50] {'loss': '0.7047', 'loss_video': '0.2667', 'loss_audio': '0.4380', 'step': 3312, 'global_step': 13869}
[2025-08-17 04:12:08] {'loss': '0.6893', 'loss_video': '0.2467', 'loss_audio': '0.4426', 'step': 3322, 'global_step': 13879}
[2025-08-17 04:14:23] {'loss': '0.6728', 'loss_video': '0.2596', 'loss_audio': '0.4131', 'step': 3332, 'global_step': 13889}
[2025-08-17 04:17:01] {'loss': '0.6066', 'loss_video': '0.2151', 'loss_audio': '0.3915', 'step': 3342, 'global_step': 13899}
[2025-08-17 04:19:55] {'loss': '0.7131', 'loss_video': '0.2973', 'loss_audio': '0.4158', 'step': 3352, 'global_step': 13909}
[2025-08-17 04:22:24] {'loss': '0.6703', 'loss_video': '0.2754', 'loss_audio': '0.3949', 'step': 3362, 'global_step': 13919}
[2025-08-17 04:25:07] {'loss': '0.6946', 'loss_video': '0.2616', 'loss_audio': '0.4329', 'step': 3372, 'global_step': 13929}
[2025-08-17 04:27:53] {'loss': '0.6637', 'loss_video': '0.2691', 'loss_audio': '0.3946', 'step': 3382, 'global_step': 13939}
[2025-08-17 04:30:29] {'loss': '0.7122', 'loss_video': '0.2685', 'loss_audio': '0.4437', 'step': 3392, 'global_step': 13949}
[2025-08-17 04:33:06] {'loss': '0.6951', 'loss_video': '0.2873', 'loss_audio': '0.4077', 'step': 3402, 'global_step': 13959}
[2025-08-17 04:35:35] {'loss': '0.6610', 'loss_video': '0.2725', 'loss_audio': '0.3885', 'step': 3412, 'global_step': 13969}
[2025-08-17 04:38:03] {'loss': '0.7447', 'loss_video': '0.2677', 'loss_audio': '0.4770', 'step': 3422, 'global_step': 13979}
[2025-08-17 04:40:25] {'loss': '0.6087', 'loss_video': '0.2260', 'loss_audio': '0.3827', 'step': 3432, 'global_step': 13989}
[2025-08-17 04:42:42] {'loss': '0.7216', 'loss_video': '0.2949', 'loss_audio': '0.4267', 'step': 3442, 'global_step': 13999}
[2025-08-17 04:42:48] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 04:43:07] Saved checkpoint at epoch 1, step 3443, global_step 14000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14000
[2025-08-17 04:43:08] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13500 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 04:45:23] {'loss': '0.6784', 'loss_video': '0.2754', 'loss_audio': '0.4030', 'step': 3452, 'global_step': 14009}
[2025-08-17 04:47:41] {'loss': '0.6482', 'loss_video': '0.2376', 'loss_audio': '0.4106', 'step': 3462, 'global_step': 14019}
[2025-08-17 04:50:22] {'loss': '0.6756', 'loss_video': '0.2652', 'loss_audio': '0.4103', 'step': 3472, 'global_step': 14029}
[2025-08-17 04:52:50] {'loss': '0.7040', 'loss_video': '0.2699', 'loss_audio': '0.4340', 'step': 3482, 'global_step': 14039}
[2025-08-17 04:55:22] {'loss': '0.7133', 'loss_video': '0.2971', 'loss_audio': '0.4163', 'step': 3492, 'global_step': 14049}
[2025-08-17 04:57:33] {'loss': '0.6925', 'loss_video': '0.2746', 'loss_audio': '0.4180', 'step': 3502, 'global_step': 14059}
[2025-08-17 05:00:11] {'loss': '0.7233', 'loss_video': '0.2822', 'loss_audio': '0.4411', 'step': 3512, 'global_step': 14069}
[2025-08-17 05:02:49] {'loss': '0.6959', 'loss_video': '0.2751', 'loss_audio': '0.4208', 'step': 3522, 'global_step': 14079}
[2025-08-17 05:05:28] {'loss': '0.7544', 'loss_video': '0.2926', 'loss_audio': '0.4619', 'step': 3532, 'global_step': 14089}
[2025-08-17 05:07:57] {'loss': '0.7141', 'loss_video': '0.2690', 'loss_audio': '0.4452', 'step': 3542, 'global_step': 14099}
[2025-08-17 05:10:36] {'loss': '0.7385', 'loss_video': '0.2579', 'loss_audio': '0.4806', 'step': 3552, 'global_step': 14109}
[2025-08-17 05:13:06] {'loss': '0.6821', 'loss_video': '0.2600', 'loss_audio': '0.4221', 'step': 3562, 'global_step': 14119}
[2025-08-17 05:15:40] {'loss': '0.7294', 'loss_video': '0.2797', 'loss_audio': '0.4497', 'step': 3572, 'global_step': 14129}
[2025-08-17 05:18:16] {'loss': '0.7650', 'loss_video': '0.2777', 'loss_audio': '0.4873', 'step': 3582, 'global_step': 14139}
[2025-08-17 05:20:51] {'loss': '0.6764', 'loss_video': '0.2729', 'loss_audio': '0.4035', 'step': 3592, 'global_step': 14149}
[2025-08-17 05:23:17] {'loss': '0.6575', 'loss_video': '0.2570', 'loss_audio': '0.4004', 'step': 3602, 'global_step': 14159}
[2025-08-17 05:25:46] {'loss': '0.6617', 'loss_video': '0.2542', 'loss_audio': '0.4074', 'step': 3612, 'global_step': 14169}
[2025-08-17 05:28:14] {'loss': '0.6438', 'loss_video': '0.2456', 'loss_audio': '0.3982', 'step': 3622, 'global_step': 14179}
[2025-08-17 05:30:29] {'loss': '0.7322', 'loss_video': '0.2820', 'loss_audio': '0.4502', 'step': 3632, 'global_step': 14189}
[2025-08-17 05:32:49] {'loss': '0.6628', 'loss_video': '0.2453', 'loss_audio': '0.4175', 'step': 3642, 'global_step': 14199}
[2025-08-17 05:35:15] {'loss': '0.7169', 'loss_video': '0.2809', 'loss_audio': '0.4361', 'step': 3652, 'global_step': 14209}
[2025-08-17 05:37:46] {'loss': '0.6721', 'loss_video': '0.2627', 'loss_audio': '0.4094', 'step': 3662, 'global_step': 14219}
[2025-08-17 05:40:22] {'loss': '0.6880', 'loss_video': '0.2502', 'loss_audio': '0.4378', 'step': 3672, 'global_step': 14229}
[2025-08-17 05:43:08] {'loss': '0.6773', 'loss_video': '0.2522', 'loss_audio': '0.4252', 'step': 3682, 'global_step': 14239}
[2025-08-17 05:45:41] {'loss': '0.6729', 'loss_video': '0.2629', 'loss_audio': '0.4100', 'step': 3692, 'global_step': 14249}
[2025-08-17 05:45:47] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 05:46:06] Saved checkpoint at epoch 1, step 3693, global_step 14250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14250
[2025-08-17 05:46:06] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step13750 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 05:48:42] {'loss': '0.7129', 'loss_video': '0.2766', 'loss_audio': '0.4363', 'step': 3702, 'global_step': 14259}
[2025-08-17 05:51:26] {'loss': '0.6837', 'loss_video': '0.2830', 'loss_audio': '0.4007', 'step': 3712, 'global_step': 14269}
[2025-08-17 05:53:52] {'loss': '0.7252', 'loss_video': '0.2989', 'loss_audio': '0.4262', 'step': 3722, 'global_step': 14279}
[2025-08-17 05:56:23] {'loss': '0.6777', 'loss_video': '0.2828', 'loss_audio': '0.3949', 'step': 3732, 'global_step': 14289}
[2025-08-17 05:58:51] {'loss': '0.7079', 'loss_video': '0.2569', 'loss_audio': '0.4510', 'step': 3742, 'global_step': 14299}
[2025-08-17 06:01:30] {'loss': '0.6151', 'loss_video': '0.2240', 'loss_audio': '0.3911', 'step': 3752, 'global_step': 14309}
[2025-08-17 06:03:59] {'loss': '0.7379', 'loss_video': '0.2739', 'loss_audio': '0.4640', 'step': 3762, 'global_step': 14319}
[2025-08-17 06:06:24] {'loss': '0.6942', 'loss_video': '0.2685', 'loss_audio': '0.4257', 'step': 3772, 'global_step': 14329}
[2025-08-17 06:08:29] {'loss': '0.6503', 'loss_video': '0.2293', 'loss_audio': '0.4210', 'step': 3782, 'global_step': 14339}
[2025-08-17 06:11:04] {'loss': '0.6318', 'loss_video': '0.2513', 'loss_audio': '0.3805', 'step': 3792, 'global_step': 14349}
[2025-08-17 06:13:26] {'loss': '0.6803', 'loss_video': '0.2816', 'loss_audio': '0.3987', 'step': 3802, 'global_step': 14359}
[2025-08-17 06:15:57] {'loss': '0.7349', 'loss_video': '0.2893', 'loss_audio': '0.4456', 'step': 3812, 'global_step': 14369}
[2025-08-17 06:17:59] {'loss': '0.6586', 'loss_video': '0.2503', 'loss_audio': '0.4083', 'step': 3822, 'global_step': 14379}
[2025-08-17 06:20:33] {'loss': '0.6709', 'loss_video': '0.2484', 'loss_audio': '0.4225', 'step': 3832, 'global_step': 14389}
[2025-08-17 06:23:05] {'loss': '0.7582', 'loss_video': '0.3142', 'loss_audio': '0.4440', 'step': 3842, 'global_step': 14399}
[2025-08-17 06:25:45] {'loss': '0.6273', 'loss_video': '0.2487', 'loss_audio': '0.3786', 'step': 3852, 'global_step': 14409}
[2025-08-17 06:28:22] {'loss': '0.7319', 'loss_video': '0.2844', 'loss_audio': '0.4475', 'step': 3862, 'global_step': 14419}
[2025-08-17 06:30:54] {'loss': '0.7201', 'loss_video': '0.3017', 'loss_audio': '0.4184', 'step': 3872, 'global_step': 14429}
[2025-08-17 06:33:13] {'loss': '0.6322', 'loss_video': '0.2363', 'loss_audio': '0.3960', 'step': 3882, 'global_step': 14439}
[2025-08-17 06:35:25] {'loss': '0.7076', 'loss_video': '0.2752', 'loss_audio': '0.4324', 'step': 3892, 'global_step': 14449}
[2025-08-17 06:37:21] {'loss': '0.6050', 'loss_video': '0.2256', 'loss_audio': '0.3794', 'step': 3902, 'global_step': 14459}
[2025-08-17 06:39:22] {'loss': '0.7194', 'loss_video': '0.2824', 'loss_audio': '0.4369', 'step': 3912, 'global_step': 14469}
[2025-08-17 06:41:58] {'loss': '0.7448', 'loss_video': '0.2760', 'loss_audio': '0.4688', 'step': 3922, 'global_step': 14479}
[2025-08-17 06:44:31] {'loss': '0.6307', 'loss_video': '0.2473', 'loss_audio': '0.3833', 'step': 3932, 'global_step': 14489}
[2025-08-17 06:47:24] {'loss': '0.6730', 'loss_video': '0.2717', 'loss_audio': '0.4013', 'step': 3942, 'global_step': 14499}
[2025-08-17 06:47:31] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 06:47:49] Saved checkpoint at epoch 1, step 3943, global_step 14500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14500
[2025-08-17 06:47:50] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14000 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 06:50:27] {'loss': '0.6454', 'loss_video': '0.2447', 'loss_audio': '0.4008', 'step': 3952, 'global_step': 14509}
[2025-08-17 06:52:54] {'loss': '0.7019', 'loss_video': '0.2729', 'loss_audio': '0.4289', 'step': 3962, 'global_step': 14519}
[2025-08-17 06:55:26] {'loss': '0.6736', 'loss_video': '0.2568', 'loss_audio': '0.4168', 'step': 3972, 'global_step': 14529}
[2025-08-17 06:57:45] {'loss': '0.6921', 'loss_video': '0.2594', 'loss_audio': '0.4328', 'step': 3982, 'global_step': 14539}
[2025-08-17 07:00:09] {'loss': '0.6789', 'loss_video': '0.2697', 'loss_audio': '0.4092', 'step': 3992, 'global_step': 14549}
[2025-08-17 07:02:15] {'loss': '0.6569', 'loss_video': '0.2635', 'loss_audio': '0.3934', 'step': 4002, 'global_step': 14559}
[2025-08-17 07:04:48] {'loss': '0.6613', 'loss_video': '0.2564', 'loss_audio': '0.4049', 'step': 4012, 'global_step': 14569}
[2025-08-17 07:07:30] {'loss': '0.7129', 'loss_video': '0.3243', 'loss_audio': '0.3886', 'step': 4022, 'global_step': 14579}
[2025-08-17 07:09:46] {'loss': '0.7137', 'loss_video': '0.2812', 'loss_audio': '0.4325', 'step': 4032, 'global_step': 14589}
[2025-08-17 07:12:15] {'loss': '0.6627', 'loss_video': '0.2712', 'loss_audio': '0.3915', 'step': 4042, 'global_step': 14599}
[2025-08-17 07:14:29] {'loss': '0.6420', 'loss_video': '0.2592', 'loss_audio': '0.3828', 'step': 4052, 'global_step': 14609}
[2025-08-17 07:16:56] {'loss': '0.6573', 'loss_video': '0.2414', 'loss_audio': '0.4159', 'step': 4062, 'global_step': 14619}
[2025-08-17 07:19:02] {'loss': '0.7223', 'loss_video': '0.2761', 'loss_audio': '0.4462', 'step': 4072, 'global_step': 14629}
[2025-08-17 07:21:31] {'loss': '0.6890', 'loss_video': '0.2715', 'loss_audio': '0.4175', 'step': 4082, 'global_step': 14639}
[2025-08-17 07:24:01] {'loss': '0.7126', 'loss_video': '0.2557', 'loss_audio': '0.4569', 'step': 4092, 'global_step': 14649}
[2025-08-17 07:26:39] {'loss': '0.7020', 'loss_video': '0.2495', 'loss_audio': '0.4525', 'step': 4102, 'global_step': 14659}
[2025-08-17 07:29:06] {'loss': '0.6799', 'loss_video': '0.2591', 'loss_audio': '0.4208', 'step': 4112, 'global_step': 14669}
[2025-08-17 07:31:47] {'loss': '0.6393', 'loss_video': '0.2421', 'loss_audio': '0.3972', 'step': 4122, 'global_step': 14679}
[2025-08-17 07:34:12] {'loss': '0.6544', 'loss_video': '0.2499', 'loss_audio': '0.4045', 'step': 4132, 'global_step': 14689}
[2025-08-17 07:36:56] {'loss': '0.6876', 'loss_video': '0.2790', 'loss_audio': '0.4086', 'step': 4142, 'global_step': 14699}
[2025-08-17 07:39:39] {'loss': '0.7196', 'loss_video': '0.3007', 'loss_audio': '0.4189', 'step': 4152, 'global_step': 14709}
[2025-08-17 07:42:01] {'loss': '0.6750', 'loss_video': '0.2469', 'loss_audio': '0.4281', 'step': 4162, 'global_step': 14719}
[2025-08-17 07:44:44] {'loss': '0.6538', 'loss_video': '0.2502', 'loss_audio': '0.4036', 'step': 4172, 'global_step': 14729}
[2025-08-17 07:47:12] {'loss': '0.7415', 'loss_video': '0.2927', 'loss_audio': '0.4488', 'step': 4182, 'global_step': 14739}
[2025-08-17 07:49:55] {'loss': '0.6484', 'loss_video': '0.2457', 'loss_audio': '0.4027', 'step': 4192, 'global_step': 14749}
[2025-08-17 07:50:01] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 07:50:21] Saved checkpoint at epoch 1, step 4193, global_step 14750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14750
[2025-08-17 07:50:21] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14250 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 07:53:10] {'loss': '0.7076', 'loss_video': '0.2882', 'loss_audio': '0.4194', 'step': 4202, 'global_step': 14759}
[2025-08-17 07:55:32] {'loss': '0.6625', 'loss_video': '0.2550', 'loss_audio': '0.4076', 'step': 4212, 'global_step': 14769}
[2025-08-17 07:58:12] {'loss': '0.7091', 'loss_video': '0.2547', 'loss_audio': '0.4544', 'step': 4222, 'global_step': 14779}
[2025-08-17 08:00:40] {'loss': '0.6703', 'loss_video': '0.2746', 'loss_audio': '0.3958', 'step': 4232, 'global_step': 14789}
[2025-08-17 08:03:19] {'loss': '0.6603', 'loss_video': '0.2439', 'loss_audio': '0.4164', 'step': 4242, 'global_step': 14799}
[2025-08-17 08:05:34] {'loss': '0.7624', 'loss_video': '0.3361', 'loss_audio': '0.4262', 'step': 4252, 'global_step': 14809}
[2025-08-17 08:08:04] {'loss': '0.6683', 'loss_video': '0.2598', 'loss_audio': '0.4085', 'step': 4262, 'global_step': 14819}
[2025-08-17 08:10:30] {'loss': '0.6728', 'loss_video': '0.2525', 'loss_audio': '0.4203', 'step': 4272, 'global_step': 14829}
[2025-08-17 08:12:45] {'loss': '0.6745', 'loss_video': '0.2730', 'loss_audio': '0.4015', 'step': 4282, 'global_step': 14839}
[2025-08-17 08:15:25] {'loss': '0.6877', 'loss_video': '0.2909', 'loss_audio': '0.3968', 'step': 4292, 'global_step': 14849}
[2025-08-17 08:17:45] {'loss': '0.6017', 'loss_video': '0.2444', 'loss_audio': '0.3573', 'step': 4302, 'global_step': 14859}
[2025-08-17 08:20:22] {'loss': '0.7053', 'loss_video': '0.2618', 'loss_audio': '0.4435', 'step': 4312, 'global_step': 14869}
[2025-08-17 08:23:12] {'loss': '0.6793', 'loss_video': '0.2627', 'loss_audio': '0.4166', 'step': 4322, 'global_step': 14879}
[2025-08-17 08:25:39] {'loss': '0.6556', 'loss_video': '0.2552', 'loss_audio': '0.4004', 'step': 4332, 'global_step': 14889}
[2025-08-17 08:27:54] {'loss': '0.6237', 'loss_video': '0.2343', 'loss_audio': '0.3894', 'step': 4342, 'global_step': 14899}
[2025-08-17 08:30:37] {'loss': '0.6479', 'loss_video': '0.2470', 'loss_audio': '0.4009', 'step': 4352, 'global_step': 14909}
[2025-08-17 08:33:09] {'loss': '0.6317', 'loss_video': '0.2600', 'loss_audio': '0.3718', 'step': 4362, 'global_step': 14919}
[2025-08-17 08:35:30] {'loss': '0.7363', 'loss_video': '0.2807', 'loss_audio': '0.4557', 'step': 4372, 'global_step': 14929}
[2025-08-17 08:37:43] {'loss': '0.6366', 'loss_video': '0.2346', 'loss_audio': '0.4021', 'step': 4382, 'global_step': 14939}
[2025-08-17 08:40:22] {'loss': '0.6504', 'loss_video': '0.2478', 'loss_audio': '0.4025', 'step': 4392, 'global_step': 14949}
[2025-08-17 08:42:35] {'loss': '0.6506', 'loss_video': '0.2444', 'loss_audio': '0.4062', 'step': 4402, 'global_step': 14959}
[2025-08-17 08:45:19] {'loss': '0.6819', 'loss_video': '0.2672', 'loss_audio': '0.4147', 'step': 4412, 'global_step': 14969}
[2025-08-17 08:47:35] {'loss': '0.6612', 'loss_video': '0.2483', 'loss_audio': '0.4129', 'step': 4422, 'global_step': 14979}
[2025-08-17 08:50:03] {'loss': '0.6321', 'loss_video': '0.2525', 'loss_audio': '0.3796', 'step': 4432, 'global_step': 14989}
[2025-08-17 08:52:45] {'loss': '0.6509', 'loss_video': '0.2462', 'loss_audio': '0.4047', 'step': 4442, 'global_step': 14999}
[2025-08-17 08:52:52] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json.
[2025-08-17 08:53:10] Saved checkpoint at epoch 1, step 4443, global_step 15000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15000
[2025-08-17 08:53:11] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14500 has been deleted successfully per cfg.save_total_limit.
[2025-08-17 08:55:35] {'loss': '0.7049', 'loss_video': '0.2964', 'loss_audio': '0.4085', 'step': 4452, 'global_step': 15009} [2025-08-17 08:58:09] {'loss': '0.6265', 'loss_video': '0.2390', 'loss_audio': '0.3875', 'step': 4462, 'global_step': 15019} [2025-08-17 09:00:15] {'loss': '0.6528', 'loss_video': '0.2577', 'loss_audio': '0.3951', 'step': 4472, 'global_step': 15029} [2025-08-17 09:02:54] {'loss': '0.6764', 'loss_video': '0.2781', 'loss_audio': '0.3983', 'step': 4482, 'global_step': 15039} [2025-08-17 09:05:04] {'loss': '0.6928', 'loss_video': '0.2784', 'loss_audio': '0.4144', 'step': 4492, 'global_step': 15049} [2025-08-17 09:07:25] {'loss': '0.6450', 'loss_video': '0.2461', 'loss_audio': '0.3989', 'step': 4502, 'global_step': 15059} [2025-08-17 09:09:27] {'loss': '0.6519', 'loss_video': '0.2338', 'loss_audio': '0.4180', 'step': 4512, 'global_step': 15069} [2025-08-17 09:11:29] {'loss': '0.6896', 'loss_video': '0.2718', 'loss_audio': '0.4178', 'step': 4522, 'global_step': 15079} [2025-08-17 09:14:10] {'loss': '0.6963', 'loss_video': '0.2752', 'loss_audio': '0.4211', 'step': 4532, 'global_step': 15089} [2025-08-17 09:16:45] {'loss': '0.7076', 'loss_video': '0.2850', 'loss_audio': '0.4226', 'step': 4542, 'global_step': 15099} [2025-08-17 09:19:21] {'loss': '0.6486', 'loss_video': '0.2577', 'loss_audio': '0.3909', 'step': 4552, 'global_step': 15109} [2025-08-17 09:21:46] {'loss': '0.6110', 'loss_video': '0.2374', 'loss_audio': '0.3736', 'step': 4562, 'global_step': 15119} [2025-08-17 09:23:51] {'loss': '0.7038', 'loss_video': '0.2645', 'loss_audio': '0.4392', 'step': 4572, 'global_step': 15129} [2025-08-17 09:26:11] {'loss': '0.6621', 'loss_video': '0.2640', 'loss_audio': '0.3982', 'step': 4582, 'global_step': 15139} [2025-08-17 09:28:24] {'loss': '0.6425', 'loss_video': '0.2287', 'loss_audio': '0.4138', 'step': 4592, 'global_step': 15149} [2025-08-17 09:30:43] {'loss': '0.6476', 'loss_video': '0.2151', 'loss_audio': '0.4325', 'step': 4602, 'global_step': 15159} 
[2025-08-17 09:33:20] {'loss': '0.6427', 'loss_video': '0.2251', 'loss_audio': '0.4176', 'step': 4612, 'global_step': 15169} [2025-08-17 09:35:38] {'loss': '0.6950', 'loss_video': '0.2839', 'loss_audio': '0.4111', 'step': 4622, 'global_step': 15179} [2025-08-17 09:38:13] {'loss': '0.6331', 'loss_video': '0.2347', 'loss_audio': '0.3984', 'step': 4632, 'global_step': 15189} [2025-08-17 09:40:39] {'loss': '0.6307', 'loss_video': '0.2312', 'loss_audio': '0.3995', 'step': 4642, 'global_step': 15199} [2025-08-17 09:43:05] {'loss': '0.7302', 'loss_video': '0.2911', 'loss_audio': '0.4391', 'step': 4652, 'global_step': 15209} [2025-08-17 09:45:32] {'loss': '0.6874', 'loss_video': '0.2617', 'loss_audio': '0.4257', 'step': 4662, 'global_step': 15219} [2025-08-17 09:47:51] {'loss': '0.6623', 'loss_video': '0.2728', 'loss_audio': '0.3895', 'step': 4672, 'global_step': 15229} [2025-08-17 09:50:29] {'loss': '0.7015', 'loss_video': '0.2467', 'loss_audio': '0.4548', 'step': 4682, 'global_step': 15239} [2025-08-17 09:52:47] {'loss': '0.6851', 'loss_video': '0.2656', 'loss_audio': '0.4195', 'step': 4692, 'global_step': 15249} [2025-08-17 09:52:54] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 09:53:12] Saved checkpoint at epoch 1, step 4693, global_step 15250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15250 [2025-08-17 09:53:12] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step14750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 09:55:26] {'loss': '0.6802', 'loss_video': '0.2645', 'loss_audio': '0.4157', 'step': 4702, 'global_step': 15259} [2025-08-17 09:58:01] {'loss': '0.6849', 'loss_video': '0.2860', 'loss_audio': '0.3989', 'step': 4712, 'global_step': 15269} [2025-08-17 10:00:36] {'loss': '0.6774', 'loss_video': '0.2609', 'loss_audio': '0.4165', 'step': 4722, 'global_step': 15279} [2025-08-17 10:02:55] {'loss': '0.6679', 'loss_video': '0.2766', 'loss_audio': '0.3914', 'step': 4732, 'global_step': 15289} [2025-08-17 10:05:03] {'loss': '0.6400', 'loss_video': '0.2357', 'loss_audio': '0.4042', 'step': 4742, 'global_step': 15299} [2025-08-17 10:07:34] {'loss': '0.6680', 'loss_video': '0.2431', 'loss_audio': '0.4249', 'step': 4752, 'global_step': 15309} [2025-08-17 10:09:44] {'loss': '0.6821', 'loss_video': '0.2567', 'loss_audio': '0.4254', 'step': 4762, 'global_step': 15319} [2025-08-17 10:12:08] {'loss': '0.6935', 'loss_video': '0.2647', 'loss_audio': '0.4288', 'step': 4772, 'global_step': 15329} [2025-08-17 10:14:33] {'loss': '0.6501', 'loss_video': '0.2649', 'loss_audio': '0.3852', 'step': 4782, 'global_step': 15339} [2025-08-17 10:17:17] {'loss': '0.6857', 'loss_video': '0.2551', 'loss_audio': '0.4306', 'step': 4792, 'global_step': 15349} [2025-08-17 10:19:40] {'loss': '0.6956', 'loss_video': '0.2493', 'loss_audio': '0.4463', 'step': 4802, 'global_step': 15359} [2025-08-17 10:22:05] {'loss': '0.6545', 'loss_video': '0.2497', 'loss_audio': '0.4048', 'step': 4812, 'global_step': 15369} [2025-08-17 10:24:32] {'loss': '0.6140', 'loss_video': '0.2352', 'loss_audio': '0.3787', 'step': 4822, 'global_step': 15379} [2025-08-17 10:27:18] {'loss': '0.7198', 'loss_video': '0.2727', 'loss_audio': '0.4472', 'step': 4832, 'global_step': 15389} [2025-08-17 10:29:42] {'loss': '0.7304', 'loss_video': '0.3150', 'loss_audio': '0.4154', 'step': 4842, 'global_step': 15399} [2025-08-17 10:32:28] {'loss': '0.6844', 'loss_video': '0.2859', 'loss_audio': '0.3985', 'step': 4852, 'global_step': 15409} 
[2025-08-17 10:34:59] {'loss': '0.6788', 'loss_video': '0.2556', 'loss_audio': '0.4232', 'step': 4862, 'global_step': 15419} [2025-08-17 10:37:29] {'loss': '0.6279', 'loss_video': '0.2274', 'loss_audio': '0.4006', 'step': 4872, 'global_step': 15429} [2025-08-17 10:40:05] {'loss': '0.6121', 'loss_video': '0.2285', 'loss_audio': '0.3836', 'step': 4882, 'global_step': 15439} [2025-08-17 10:42:49] {'loss': '0.7257', 'loss_video': '0.2501', 'loss_audio': '0.4755', 'step': 4892, 'global_step': 15449} [2025-08-17 10:45:03] {'loss': '0.6397', 'loss_video': '0.2357', 'loss_audio': '0.4040', 'step': 4902, 'global_step': 15459} [2025-08-17 10:47:45] {'loss': '0.7094', 'loss_video': '0.2517', 'loss_audio': '0.4577', 'step': 4912, 'global_step': 15469} [2025-08-17 10:49:57] {'loss': '0.6391', 'loss_video': '0.2324', 'loss_audio': '0.4067', 'step': 4922, 'global_step': 15479} [2025-08-17 10:52:10] {'loss': '0.7346', 'loss_video': '0.2791', 'loss_audio': '0.4555', 'step': 4932, 'global_step': 15489} [2025-08-17 10:54:43] {'loss': '0.7196', 'loss_video': '0.2718', 'loss_audio': '0.4478', 'step': 4942, 'global_step': 15499} [2025-08-17 10:54:50] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 10:55:10] Saved checkpoint at epoch 1, step 4943, global_step 15500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15500 [2025-08-17 10:55:10] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 10:57:51] {'loss': '0.6856', 'loss_video': '0.2410', 'loss_audio': '0.4446', 'step': 4952, 'global_step': 15509} [2025-08-17 11:00:32] {'loss': '0.6980', 'loss_video': '0.2714', 'loss_audio': '0.4266', 'step': 4962, 'global_step': 15519} [2025-08-17 11:03:08] {'loss': '0.6724', 'loss_video': '0.2594', 'loss_audio': '0.4130', 'step': 4972, 'global_step': 15529} [2025-08-17 11:05:39] {'loss': '0.6941', 'loss_video': '0.2634', 'loss_audio': '0.4307', 'step': 4982, 'global_step': 15539} [2025-08-17 11:08:18] {'loss': '0.7200', 'loss_video': '0.2812', 'loss_audio': '0.4388', 'step': 4992, 'global_step': 15549} [2025-08-17 11:10:37] {'loss': '0.7748', 'loss_video': '0.2885', 'loss_audio': '0.4863', 'step': 5002, 'global_step': 15559} [2025-08-17 11:13:09] {'loss': '0.6157', 'loss_video': '0.2385', 'loss_audio': '0.3771', 'step': 5012, 'global_step': 15569} [2025-08-17 11:15:36] {'loss': '0.6720', 'loss_video': '0.2650', 'loss_audio': '0.4070', 'step': 5022, 'global_step': 15579} [2025-08-17 11:18:05] {'loss': '0.6862', 'loss_video': '0.2630', 'loss_audio': '0.4232', 'step': 5032, 'global_step': 15589} [2025-08-17 11:20:16] {'loss': '0.6771', 'loss_video': '0.2535', 'loss_audio': '0.4235', 'step': 5042, 'global_step': 15599} [2025-08-17 11:23:01] {'loss': '0.6706', 'loss_video': '0.2684', 'loss_audio': '0.4022', 'step': 5052, 'global_step': 15609} [2025-08-17 11:25:24] {'loss': '0.6687', 'loss_video': '0.2373', 'loss_audio': '0.4314', 'step': 5062, 'global_step': 15619} [2025-08-17 11:27:46] {'loss': '0.6872', 'loss_video': '0.2756', 'loss_audio': '0.4116', 'step': 5072, 'global_step': 15629} [2025-08-17 11:30:01] {'loss': '0.6198', 'loss_video': '0.2417', 'loss_audio': '0.3781', 'step': 5082, 'global_step': 15639} [2025-08-17 11:32:33] {'loss': '0.6708', 'loss_video': '0.2652', 'loss_audio': '0.4056', 'step': 5092, 'global_step': 15649} [2025-08-17 11:34:51] {'loss': '0.6307', 'loss_video': '0.2340', 'loss_audio': '0.3967', 'step': 5102, 'global_step': 15659} 
[2025-08-17 11:37:11] {'loss': '0.6808', 'loss_video': '0.2446', 'loss_audio': '0.4362', 'step': 5112, 'global_step': 15669} [2025-08-17 11:39:57] {'loss': '0.6641', 'loss_video': '0.2460', 'loss_audio': '0.4181', 'step': 5122, 'global_step': 15679} [2025-08-17 11:42:32] {'loss': '0.6859', 'loss_video': '0.2676', 'loss_audio': '0.4183', 'step': 5132, 'global_step': 15689} [2025-08-17 11:45:08] {'loss': '0.6407', 'loss_video': '0.2431', 'loss_audio': '0.3976', 'step': 5142, 'global_step': 15699} [2025-08-17 11:47:32] {'loss': '0.6883', 'loss_video': '0.2798', 'loss_audio': '0.4085', 'step': 5152, 'global_step': 15709} [2025-08-17 11:50:02] {'loss': '0.7120', 'loss_video': '0.3007', 'loss_audio': '0.4113', 'step': 5162, 'global_step': 15719} [2025-08-17 11:52:46] {'loss': '0.7294', 'loss_video': '0.2940', 'loss_audio': '0.4355', 'step': 5172, 'global_step': 15729} [2025-08-17 11:55:18] {'loss': '0.6605', 'loss_video': '0.2554', 'loss_audio': '0.4052', 'step': 5182, 'global_step': 15739} [2025-08-17 11:57:53] {'loss': '0.7450', 'loss_video': '0.3188', 'loss_audio': '0.4263', 'step': 5192, 'global_step': 15749} [2025-08-17 11:58:00] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 11:58:19] Saved checkpoint at epoch 1, step 5193, global_step 15750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15750 [2025-08-17 11:58:19] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 12:00:56] {'loss': '0.6790', 'loss_video': '0.2569', 'loss_audio': '0.4222', 'step': 5202, 'global_step': 15759} [2025-08-17 12:03:27] {'loss': '0.6386', 'loss_video': '0.2659', 'loss_audio': '0.3727', 'step': 5212, 'global_step': 15769} [2025-08-17 12:05:54] {'loss': '0.7563', 'loss_video': '0.2962', 'loss_audio': '0.4601', 'step': 5222, 'global_step': 15779} [2025-08-17 12:08:14] {'loss': '0.7262', 'loss_video': '0.2741', 'loss_audio': '0.4521', 'step': 5232, 'global_step': 15789} [2025-08-17 12:10:45] {'loss': '0.6689', 'loss_video': '0.2634', 'loss_audio': '0.4056', 'step': 5242, 'global_step': 15799} [2025-08-17 12:12:59] {'loss': '0.6820', 'loss_video': '0.2495', 'loss_audio': '0.4324', 'step': 5252, 'global_step': 15809} [2025-08-17 12:15:37] {'loss': '0.6473', 'loss_video': '0.2369', 'loss_audio': '0.4104', 'step': 5262, 'global_step': 15819} [2025-08-17 12:17:59] {'loss': '0.6929', 'loss_video': '0.2464', 'loss_audio': '0.4466', 'step': 5272, 'global_step': 15829} [2025-08-17 12:20:37] {'loss': '0.6780', 'loss_video': '0.2693', 'loss_audio': '0.4087', 'step': 5282, 'global_step': 15839} [2025-08-17 12:23:22] {'loss': '0.6362', 'loss_video': '0.2608', 'loss_audio': '0.3754', 'step': 5292, 'global_step': 15849} [2025-08-17 12:25:36] {'loss': '0.7168', 'loss_video': '0.2654', 'loss_audio': '0.4514', 'step': 5302, 'global_step': 15859} [2025-08-17 12:27:56] {'loss': '0.6683', 'loss_video': '0.2492', 'loss_audio': '0.4191', 'step': 5312, 'global_step': 15869} [2025-08-17 12:30:09] {'loss': '0.6609', 'loss_video': '0.2666', 'loss_audio': '0.3943', 'step': 5322, 'global_step': 15879} [2025-08-17 12:32:46] {'loss': '0.6497', 'loss_video': '0.2446', 'loss_audio': '0.4051', 'step': 5332, 'global_step': 15889} [2025-08-17 12:35:16] {'loss': '0.6734', 'loss_video': '0.2785', 'loss_audio': '0.3949', 'step': 5342, 'global_step': 15899} [2025-08-17 12:37:44] {'loss': '0.7159', 'loss_video': '0.2856', 'loss_audio': '0.4303', 'step': 5352, 'global_step': 15909} 
[2025-08-17 12:40:15] {'loss': '0.6359', 'loss_video': '0.2400', 'loss_audio': '0.3958', 'step': 5362, 'global_step': 15919} [2025-08-17 12:42:42] {'loss': '0.6818', 'loss_video': '0.2783', 'loss_audio': '0.4035', 'step': 5372, 'global_step': 15929} [2025-08-17 12:45:10] {'loss': '0.7128', 'loss_video': '0.2802', 'loss_audio': '0.4326', 'step': 5382, 'global_step': 15939} [2025-08-17 12:47:38] {'loss': '0.6609', 'loss_video': '0.2662', 'loss_audio': '0.3948', 'step': 5392, 'global_step': 15949} [2025-08-17 12:50:11] {'loss': '0.6304', 'loss_video': '0.2255', 'loss_audio': '0.4049', 'step': 5402, 'global_step': 15959} [2025-08-17 12:52:43] {'loss': '0.6539', 'loss_video': '0.2463', 'loss_audio': '0.4076', 'step': 5412, 'global_step': 15969} [2025-08-17 12:55:12] {'loss': '0.6855', 'loss_video': '0.2676', 'loss_audio': '0.4178', 'step': 5422, 'global_step': 15979} [2025-08-17 12:57:48] {'loss': '0.6741', 'loss_video': '0.2547', 'loss_audio': '0.4194', 'step': 5432, 'global_step': 15989} [2025-08-17 12:59:52] {'loss': '0.6532', 'loss_video': '0.2494', 'loss_audio': '0.4038', 'step': 5442, 'global_step': 15999} [2025-08-17 12:59:58] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 13:00:17] Saved checkpoint at epoch 1, step 5443, global_step 16000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step16000 [2025-08-17 13:00:17] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 13:03:02] {'loss': '0.6182', 'loss_video': '0.2198', 'loss_audio': '0.3985', 'step': 5452, 'global_step': 16009} [2025-08-17 13:05:30] {'loss': '0.6952', 'loss_video': '0.2634', 'loss_audio': '0.4318', 'step': 5462, 'global_step': 16019} [2025-08-17 13:08:05] {'loss': '0.6859', 'loss_video': '0.2694', 'loss_audio': '0.4164', 'step': 5472, 'global_step': 16029} [2025-08-17 13:10:43] {'loss': '0.6891', 'loss_video': '0.2615', 'loss_audio': '0.4276', 'step': 5482, 'global_step': 16039} [2025-08-17 13:13:26] {'loss': '0.6723', 'loss_video': '0.2424', 'loss_audio': '0.4299', 'step': 5492, 'global_step': 16049} [2025-08-17 13:16:14] {'loss': '0.6387', 'loss_video': '0.2147', 'loss_audio': '0.4239', 'step': 5502, 'global_step': 16059} [2025-08-17 13:18:40] {'loss': '0.6558', 'loss_video': '0.2460', 'loss_audio': '0.4098', 'step': 5512, 'global_step': 16069} [2025-08-17 13:21:15] {'loss': '0.6888', 'loss_video': '0.2652', 'loss_audio': '0.4236', 'step': 5522, 'global_step': 16079} [2025-08-17 13:23:44] {'loss': '0.6887', 'loss_video': '0.2733', 'loss_audio': '0.4154', 'step': 5532, 'global_step': 16089} [2025-08-17 13:25:47] {'loss': '0.6396', 'loss_video': '0.2410', 'loss_audio': '0.3987', 'step': 5542, 'global_step': 16099} [2025-08-17 13:28:19] {'loss': '0.6332', 'loss_video': '0.2258', 'loss_audio': '0.4074', 'step': 5552, 'global_step': 16109} [2025-08-17 13:30:56] {'loss': '0.6496', 'loss_video': '0.2417', 'loss_audio': '0.4079', 'step': 5562, 'global_step': 16119} [2025-08-17 13:32:54] {'loss': '0.6467', 'loss_video': '0.2583', 'loss_audio': '0.3884', 'step': 5572, 'global_step': 16129} [2025-08-17 13:35:25] {'loss': '0.6969', 'loss_video': '0.2564', 'loss_audio': '0.4405', 'step': 5582, 'global_step': 16139} [2025-08-17 13:37:58] {'loss': '0.6727', 'loss_video': '0.2693', 'loss_audio': '0.4034', 'step': 5592, 'global_step': 16149} [2025-08-17 13:40:25] {'loss': '0.7065', 'loss_video': '0.2805', 'loss_audio': '0.4260', 'step': 5602, 'global_step': 16159} 
[2025-08-17 13:42:51] {'loss': '0.6474', 'loss_video': '0.2370', 'loss_audio': '0.4103', 'step': 5612, 'global_step': 16169} [2025-08-17 13:45:25] {'loss': '0.7064', 'loss_video': '0.2739', 'loss_audio': '0.4325', 'step': 5622, 'global_step': 16179} [2025-08-17 13:48:03] {'loss': '0.7112', 'loss_video': '0.2900', 'loss_audio': '0.4212', 'step': 5632, 'global_step': 16189} [2025-08-17 13:50:32] {'loss': '0.6804', 'loss_video': '0.2659', 'loss_audio': '0.4145', 'step': 5642, 'global_step': 16199} [2025-08-17 13:52:55] {'loss': '0.6338', 'loss_video': '0.2465', 'loss_audio': '0.3873', 'step': 5652, 'global_step': 16209} [2025-08-17 13:55:33] {'loss': '0.6547', 'loss_video': '0.2200', 'loss_audio': '0.4347', 'step': 5662, 'global_step': 16219} [2025-08-17 13:57:45] {'loss': '0.7146', 'loss_video': '0.2953', 'loss_audio': '0.4193', 'step': 5672, 'global_step': 16229} [2025-08-17 14:00:02] {'loss': '0.6697', 'loss_video': '0.2788', 'loss_audio': '0.3909', 'step': 5682, 'global_step': 16239} [2025-08-17 14:02:06] {'loss': '0.6774', 'loss_video': '0.2630', 'loss_audio': '0.4144', 'step': 5692, 'global_step': 16249} [2025-08-17 14:02:13] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 14:02:32] Saved checkpoint at epoch 1, step 5693, global_step 16250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step16250 [2025-08-17 14:02:32] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step15750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 14:04:52] {'loss': '0.6413', 'loss_video': '0.2400', 'loss_audio': '0.4013', 'step': 5702, 'global_step': 16259} [2025-08-17 14:07:25] {'loss': '0.6363', 'loss_video': '0.2314', 'loss_audio': '0.4049', 'step': 5712, 'global_step': 16269} [2025-08-17 14:09:48] {'loss': '0.6574', 'loss_video': '0.2458', 'loss_audio': '0.4117', 'step': 5722, 'global_step': 16279} [2025-08-17 14:12:16] {'loss': '0.7068', 'loss_video': '0.2774', 'loss_audio': '0.4294', 'step': 5732, 'global_step': 16289} [2025-08-17 14:14:51] {'loss': '0.6777', 'loss_video': '0.2776', 'loss_audio': '0.4000', 'step': 5742, 'global_step': 16299} [2025-08-17 14:17:27] {'loss': '0.7704', 'loss_video': '0.3048', 'loss_audio': '0.4657', 'step': 5752, 'global_step': 16309} [2025-08-17 14:20:03] {'loss': '0.6764', 'loss_video': '0.2541', 'loss_audio': '0.4224', 'step': 5762, 'global_step': 16319} [2025-08-17 14:22:21] {'loss': '0.6137', 'loss_video': '0.2374', 'loss_audio': '0.3763', 'step': 5772, 'global_step': 16329} [2025-08-17 14:24:54] {'loss': '0.7006', 'loss_video': '0.2779', 'loss_audio': '0.4227', 'step': 5782, 'global_step': 16339} [2025-08-17 14:27:22] {'loss': '0.7327', 'loss_video': '0.3057', 'loss_audio': '0.4270', 'step': 5792, 'global_step': 16349} [2025-08-17 14:29:51] {'loss': '0.6801', 'loss_video': '0.2720', 'loss_audio': '0.4081', 'step': 5802, 'global_step': 16359} [2025-08-17 14:32:14] {'loss': '0.7115', 'loss_video': '0.2788', 'loss_audio': '0.4327', 'step': 5812, 'global_step': 16369} [2025-08-17 14:34:42] {'loss': '0.7348', 'loss_video': '0.3147', 'loss_audio': '0.4200', 'step': 5822, 'global_step': 16379} [2025-08-17 14:36:52] {'loss': '0.7066', 'loss_video': '0.2701', 'loss_audio': '0.4365', 'step': 5832, 'global_step': 16389} [2025-08-17 14:39:31] {'loss': '0.6310', 'loss_video': '0.2467', 'loss_audio': '0.3843', 'step': 5842, 'global_step': 16399} [2025-08-17 14:42:06] {'loss': '0.7237', 'loss_video': '0.2721', 'loss_audio': '0.4517', 'step': 5852, 'global_step': 16409} 
[2025-08-17 14:44:13] {'loss': '0.7143', 'loss_video': '0.2909', 'loss_audio': '0.4234', 'step': 5862, 'global_step': 16419} [2025-08-17 14:46:41] {'loss': '0.6954', 'loss_video': '0.2752', 'loss_audio': '0.4201', 'step': 5872, 'global_step': 16429} [2025-08-17 14:48:50] {'loss': '0.6060', 'loss_video': '0.2285', 'loss_audio': '0.3776', 'step': 5882, 'global_step': 16439} [2025-08-17 14:51:11] {'loss': '0.6924', 'loss_video': '0.2815', 'loss_audio': '0.4109', 'step': 5892, 'global_step': 16449} [2025-08-17 14:53:58] {'loss': '0.6805', 'loss_video': '0.2620', 'loss_audio': '0.4186', 'step': 5902, 'global_step': 16459} [2025-08-17 14:56:03] {'loss': '0.6330', 'loss_video': '0.2504', 'loss_audio': '0.3826', 'step': 5912, 'global_step': 16469} [2025-08-17 14:58:06] {'loss': '0.6428', 'loss_video': '0.2407', 'loss_audio': '0.4020', 'step': 5922, 'global_step': 16479} [2025-08-17 15:00:46] {'loss': '0.7146', 'loss_video': '0.2805', 'loss_audio': '0.4341', 'step': 5932, 'global_step': 16489} [2025-08-17 15:03:07] {'loss': '0.6644', 'loss_video': '0.2654', 'loss_audio': '0.3991', 'step': 5942, 'global_step': 16499} [2025-08-17 15:03:14] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. 
[2025-08-17 15:03:33] Saved checkpoint at epoch 1, step 5943, global_step 16500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step16500 [2025-08-17 15:05:59] {'loss': '0.6602', 'loss_video': '0.2646', 'loss_audio': '0.3955', 'step': 5952, 'global_step': 16509} [2025-08-17 15:08:15] {'loss': '0.6266', 'loss_video': '0.2381', 'loss_audio': '0.3886', 'step': 5962, 'global_step': 16519} [2025-08-17 15:10:36] {'loss': '0.6647', 'loss_video': '0.2530', 'loss_audio': '0.4117', 'step': 5972, 'global_step': 16529} [2025-08-17 15:13:07] {'loss': '0.6681', 'loss_video': '0.2621', 'loss_audio': '0.4060', 'step': 5982, 'global_step': 16539} [2025-08-17 15:15:52] {'loss': '0.6774', 'loss_video': '0.2582', 'loss_audio': '0.4192', 'step': 5992, 'global_step': 16549} [2025-08-17 15:18:08] {'loss': '0.6425', 'loss_video': '0.2322', 'loss_audio': '0.4102', 'step': 6002, 'global_step': 16559} [2025-08-17 15:20:25] {'loss': '0.6524', 'loss_video': '0.2504', 'loss_audio': '0.4019', 'step': 6012, 'global_step': 16569} [2025-08-17 15:22:58] {'loss': '0.6758', 'loss_video': '0.2761', 'loss_audio': '0.3997', 'step': 6022, 'global_step': 16579} [2025-08-17 15:25:22] {'loss': '0.7501', 'loss_video': '0.3111', 'loss_audio': '0.4391', 'step': 6032, 'global_step': 16589} [2025-08-17 15:28:03] {'loss': '0.7048', 'loss_video': '0.2885', 'loss_audio': '0.4162', 'step': 6042, 'global_step': 16599} [2025-08-17 15:30:33] {'loss': '0.7109', 'loss_video': '0.2706', 'loss_audio': '0.4402', 'step': 6052, 'global_step': 16609} [2025-08-17 15:33:17] {'loss': '0.6621', 'loss_video': '0.2611', 'loss_audio': '0.4010', 'step': 6062, 'global_step': 16619} [2025-08-17 15:35:57] {'loss': '0.6863', 'loss_video': '0.2895', 'loss_audio': '0.3968', 'step': 6072, 'global_step': 16629} [2025-08-17 15:38:26] {'loss': '0.6643', 'loss_video': '0.2595', 'loss_audio': '0.4048', 'step': 6082, 'global_step': 16639}
[2025-08-17 15:41:18] {'loss': '0.6967', 'loss_video': '0.2748', 'loss_audio': '0.4219', 'step': 6092, 'global_step': 16649} [2025-08-17 15:43:45] {'loss': '0.7380', 'loss_video': '0.2901', 'loss_audio': '0.4479', 'step': 6102, 'global_step': 16659} [2025-08-17 15:46:16] {'loss': '0.7054', 'loss_video': '0.2630', 'loss_audio': '0.4424', 'step': 6112, 'global_step': 16669} [2025-08-17 15:48:32] {'loss': '0.6538', 'loss_video': '0.2330', 'loss_audio': '0.4209', 'step': 6122, 'global_step': 16679} [2025-08-17 15:51:03] {'loss': '0.6464', 'loss_video': '0.2310', 'loss_audio': '0.4154', 'step': 6132, 'global_step': 16689} [2025-08-17 15:53:05] {'loss': '0.6736', 'loss_video': '0.2524', 'loss_audio': '0.4212', 'step': 6142, 'global_step': 16699} [2025-08-17 15:55:14] {'loss': '0.6538', 'loss_video': '0.2508', 'loss_audio': '0.4029', 'step': 6152, 'global_step': 16709} [2025-08-17 15:57:45] {'loss': '0.6300', 'loss_video': '0.2277', 'loss_audio': '0.4024', 'step': 6162, 'global_step': 16719} [2025-08-17 16:00:23] {'loss': '0.6789', 'loss_video': '0.2733', 'loss_audio': '0.4056', 'step': 6172, 'global_step': 16729} [2025-08-17 16:03:04] {'loss': '0.6773', 'loss_video': '0.2812', 'loss_audio': '0.3961', 'step': 6182, 'global_step': 16739} [2025-08-17 16:05:55] {'loss': '0.8212', 'loss_video': '0.3282', 'loss_audio': '0.4930', 'step': 6192, 'global_step': 16749} [2025-08-17 16:06:02] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 16:06:21] Saved checkpoint at epoch 1, step 6193, global_step 16750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step16750 [2025-08-17 16:06:21] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step16250 has been deleted successfully as cfg.save_total_limit!
[2025-08-17 16:08:42] {'loss': '0.6972', 'loss_video': '0.2862', 'loss_audio': '0.4110', 'step': 6202, 'global_step': 16759} [2025-08-17 16:11:28] {'loss': '0.6812', 'loss_video': '0.2889', 'loss_audio': '0.3923', 'step': 6212, 'global_step': 16769} [2025-08-17 16:14:12] {'loss': '0.6946', 'loss_video': '0.2790', 'loss_audio': '0.4157', 'step': 6222, 'global_step': 16779} [2025-08-17 16:16:37] {'loss': '0.6954', 'loss_video': '0.2756', 'loss_audio': '0.4197', 'step': 6232, 'global_step': 16789} [2025-08-17 16:19:02] {'loss': '0.6653', 'loss_video': '0.2696', 'loss_audio': '0.3957', 'step': 6242, 'global_step': 16799} [2025-08-17 16:21:27] {'loss': '0.7110', 'loss_video': '0.2705', 'loss_audio': '0.4405', 'step': 6252, 'global_step': 16809} [2025-08-17 16:24:04] {'loss': '0.7392', 'loss_video': '0.2871', 'loss_audio': '0.4521', 'step': 6262, 'global_step': 16819} [2025-08-17 16:26:41] {'loss': '0.7003', 'loss_video': '0.2616', 'loss_audio': '0.4387', 'step': 6272, 'global_step': 16829} [2025-08-17 16:29:05] {'loss': '0.7285', 'loss_video': '0.2779', 'loss_audio': '0.4506', 'step': 6282, 'global_step': 16839} [2025-08-17 16:31:39] {'loss': '0.7539', 'loss_video': '0.2849', 'loss_audio': '0.4690', 'step': 6292, 'global_step': 16849} [2025-08-17 16:34:22] {'loss': '0.6906', 'loss_video': '0.2638', 'loss_audio': '0.4268', 'step': 6302, 'global_step': 16859} [2025-08-17 16:36:50] {'loss': '0.6758', 'loss_video': '0.2936', 'loss_audio': '0.3822', 'step': 6312, 'global_step': 16869} [2025-08-17 16:39:16] {'loss': '0.6643', 'loss_video': '0.2464', 'loss_audio': '0.4179', 'step': 6322, 'global_step': 16879} [2025-08-17 16:41:40] {'loss': '0.6293', 'loss_video': '0.2313', 'loss_audio': '0.3980', 'step': 6332, 'global_step': 16889} [2025-08-17 16:43:57] {'loss': '0.6249', 'loss_video': '0.2069', 'loss_audio': '0.4180', 'step': 6342, 'global_step': 16899} [2025-08-17 16:46:29] {'loss': '0.6600', 'loss_video': '0.2565', 'loss_audio': '0.4035', 'step': 6352, 'global_step': 16909} 
[2025-08-17 16:49:04] {'loss': '0.6142', 'loss_video': '0.2220', 'loss_audio': '0.3923', 'step': 6362, 'global_step': 16919} [2025-08-17 16:51:25] {'loss': '0.6564', 'loss_video': '0.2561', 'loss_audio': '0.4003', 'step': 6372, 'global_step': 16929} [2025-08-17 16:53:47] {'loss': '0.6277', 'loss_video': '0.2373', 'loss_audio': '0.3904', 'step': 6382, 'global_step': 16939} [2025-08-17 16:56:08] {'loss': '0.6782', 'loss_video': '0.2598', 'loss_audio': '0.4183', 'step': 6392, 'global_step': 16949} [2025-08-17 16:58:49] {'loss': '0.6840', 'loss_video': '0.2662', 'loss_audio': '0.4178', 'step': 6402, 'global_step': 16959} [2025-08-17 17:01:34] {'loss': '0.6442', 'loss_video': '0.2513', 'loss_audio': '0.3929', 'step': 6412, 'global_step': 16969} [2025-08-17 17:03:58] {'loss': '0.6564', 'loss_video': '0.2653', 'loss_audio': '0.3912', 'step': 6422, 'global_step': 16979} [2025-08-17 17:06:32] {'loss': '0.7118', 'loss_video': '0.2567', 'loss_audio': '0.4551', 'step': 6432, 'global_step': 16989} [2025-08-17 17:08:47] {'loss': '0.6944', 'loss_video': '0.2762', 'loss_audio': '0.4182', 'step': 6442, 'global_step': 16999} [2025-08-17 17:08:53] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 17:09:11] Saved checkpoint at epoch 1, step 6443, global_step 17000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17000 [2025-08-17 17:09:11] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step16500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 17:11:48] {'loss': '0.6609', 'loss_video': '0.2466', 'loss_audio': '0.4143', 'step': 6452, 'global_step': 17009} [2025-08-17 17:14:11] {'loss': '0.6730', 'loss_video': '0.2539', 'loss_audio': '0.4191', 'step': 6462, 'global_step': 17019} [2025-08-17 17:16:22] {'loss': '0.6762', 'loss_video': '0.2462', 'loss_audio': '0.4299', 'step': 6472, 'global_step': 17029} [2025-08-17 17:19:01] {'loss': '0.7120', 'loss_video': '0.2895', 'loss_audio': '0.4225', 'step': 6482, 'global_step': 17039} [2025-08-17 17:21:42] {'loss': '0.7182', 'loss_video': '0.2919', 'loss_audio': '0.4263', 'step': 6492, 'global_step': 17049} [2025-08-17 17:24:05] {'loss': '0.6539', 'loss_video': '0.2479', 'loss_audio': '0.4060', 'step': 6502, 'global_step': 17059} [2025-08-17 17:26:51] {'loss': '0.6975', 'loss_video': '0.2853', 'loss_audio': '0.4122', 'step': 6512, 'global_step': 17069} [2025-08-17 17:29:11] {'loss': '0.6482', 'loss_video': '0.2541', 'loss_audio': '0.3941', 'step': 6522, 'global_step': 17079} [2025-08-17 17:31:30] {'loss': '0.6300', 'loss_video': '0.2601', 'loss_audio': '0.3699', 'step': 6532, 'global_step': 17089} [2025-08-17 17:33:58] {'loss': '0.6714', 'loss_video': '0.2432', 'loss_audio': '0.4282', 'step': 6542, 'global_step': 17099} [2025-08-17 17:36:14] {'loss': '0.7382', 'loss_video': '0.2741', 'loss_audio': '0.4642', 'step': 6552, 'global_step': 17109} [2025-08-17 17:38:38] {'loss': '0.6796', 'loss_video': '0.2792', 'loss_audio': '0.4004', 'step': 6562, 'global_step': 17119} [2025-08-17 17:41:15] {'loss': '0.6189', 'loss_video': '0.2418', 'loss_audio': '0.3770', 'step': 6572, 'global_step': 17129} [2025-08-17 17:43:37] {'loss': '0.7018', 'loss_video': '0.2858', 'loss_audio': '0.4160', 'step': 6582, 'global_step': 17139} [2025-08-17 17:45:32] {'loss': '0.7182', 'loss_video': '0.2623', 'loss_audio': '0.4559', 'step': 6592, 'global_step': 17149} [2025-08-17 17:48:11] {'loss': '0.6581', 'loss_video': '0.2792', 'loss_audio': '0.3789', 'step': 6602, 'global_step': 17159} 
[2025-08-17 17:50:24] {'loss': '0.7016', 'loss_video': '0.2496', 'loss_audio': '0.4520', 'step': 6612, 'global_step': 17169} [2025-08-17 17:52:49] {'loss': '0.7183', 'loss_video': '0.2825', 'loss_audio': '0.4358', 'step': 6622, 'global_step': 17179} [2025-08-17 17:55:11] {'loss': '0.6817', 'loss_video': '0.2414', 'loss_audio': '0.4403', 'step': 6632, 'global_step': 17189} [2025-08-17 17:57:35] {'loss': '0.6730', 'loss_video': '0.2548', 'loss_audio': '0.4182', 'step': 6642, 'global_step': 17199} [2025-08-17 17:59:47] {'loss': '0.6784', 'loss_video': '0.2541', 'loss_audio': '0.4244', 'step': 6652, 'global_step': 17209} [2025-08-17 18:02:12] {'loss': '0.6741', 'loss_video': '0.2542', 'loss_audio': '0.4199', 'step': 6662, 'global_step': 17219} [2025-08-17 18:04:40] {'loss': '0.7422', 'loss_video': '0.2841', 'loss_audio': '0.4581', 'step': 6672, 'global_step': 17229} [2025-08-17 18:07:04] {'loss': '0.7397', 'loss_video': '0.2844', 'loss_audio': '0.4553', 'step': 6682, 'global_step': 17239} [2025-08-17 18:09:40] {'loss': '0.7254', 'loss_video': '0.2827', 'loss_audio': '0.4427', 'step': 6692, 'global_step': 17249} [2025-08-17 18:09:46] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 18:10:04] Saved checkpoint at epoch 1, step 6693, global_step 17250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17250 [2025-08-17 18:10:05] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step16750 has been deleted successfully as cfg.save_total_limit! 
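The saves in this run land exactly on multiples of `cfg.ckpt_every = 250` global steps (15000, 15250, ..., 17250), one step after the logged entry that crosses the boundary. The trigger condition can be sketched as follows; `should_save` is an illustrative name, not the trainer's actual function:

```python
def should_save(global_step, ckpt_every=250):
    """True when training should write a checkpoint, per cfg.ckpt_every."""
    return global_step > 0 and global_step % ckpt_every == 0

# Every save observed in this section of the log falls on a 250-step boundary.
saves = [s for s in range(15000, 17251) if should_save(s)]
# saves == [15000, 15250, 15500, ..., 17250]
```

Combined with `save_total_limit = 2`, this yields the steady save/delete rhythm visible above: each new 250-step checkpoint evicts the one from 500 steps earlier.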
[2025-08-17 18:12:39] {'loss': '0.7245', 'loss_video': '0.2937', 'loss_audio': '0.4308', 'step': 6702, 'global_step': 17259} [2025-08-17 18:15:06] {'loss': '0.6637', 'loss_video': '0.2631', 'loss_audio': '0.4007', 'step': 6712, 'global_step': 17269} [2025-08-17 18:17:36] {'loss': '0.6806', 'loss_video': '0.2592', 'loss_audio': '0.4214', 'step': 6722, 'global_step': 17279} [2025-08-17 18:20:19] {'loss': '0.6554', 'loss_video': '0.2611', 'loss_audio': '0.3944', 'step': 6732, 'global_step': 17289} [2025-08-17 18:23:00] {'loss': '0.6833', 'loss_video': '0.2652', 'loss_audio': '0.4181', 'step': 6742, 'global_step': 17299} [2025-08-17 18:25:32] {'loss': '0.7318', 'loss_video': '0.3100', 'loss_audio': '0.4218', 'step': 6752, 'global_step': 17309} [2025-08-17 18:27:59] {'loss': '0.6553', 'loss_video': '0.2644', 'loss_audio': '0.3910', 'step': 6762, 'global_step': 17319} [2025-08-17 18:30:37] {'loss': '0.6361', 'loss_video': '0.2384', 'loss_audio': '0.3977', 'step': 6772, 'global_step': 17329} [2025-08-17 18:33:25] {'loss': '0.6688', 'loss_video': '0.2307', 'loss_audio': '0.4381', 'step': 6782, 'global_step': 17339} [2025-08-17 18:35:49] {'loss': '0.6680', 'loss_video': '0.2649', 'loss_audio': '0.4031', 'step': 6792, 'global_step': 17349} [2025-08-17 18:38:24] {'loss': '0.6900', 'loss_video': '0.2664', 'loss_audio': '0.4236', 'step': 6802, 'global_step': 17359} [2025-08-17 18:40:35] {'loss': '0.6971', 'loss_video': '0.2937', 'loss_audio': '0.4035', 'step': 6812, 'global_step': 17369} [2025-08-17 18:43:10] {'loss': '0.6875', 'loss_video': '0.2864', 'loss_audio': '0.4011', 'step': 6822, 'global_step': 17379} [2025-08-17 18:45:26] {'loss': '0.6393', 'loss_video': '0.2449', 'loss_audio': '0.3944', 'step': 6832, 'global_step': 17389} [2025-08-17 18:47:32] {'loss': '0.7052', 'loss_video': '0.2694', 'loss_audio': '0.4358', 'step': 6842, 'global_step': 17399} [2025-08-17 18:50:20] {'loss': '0.6697', 'loss_video': '0.2494', 'loss_audio': '0.4203', 'step': 6852, 'global_step': 17409} 
[2025-08-17 18:52:18] {'loss': '0.6387', 'loss_video': '0.2358', 'loss_audio': '0.4029', 'step': 6862, 'global_step': 17419} [2025-08-17 18:54:59] {'loss': '0.6752', 'loss_video': '0.2473', 'loss_audio': '0.4278', 'step': 6872, 'global_step': 17429} [2025-08-17 18:57:31] {'loss': '0.6426', 'loss_video': '0.2597', 'loss_audio': '0.3830', 'step': 6882, 'global_step': 17439} [2025-08-17 18:59:58] {'loss': '0.7295', 'loss_video': '0.3131', 'loss_audio': '0.4164', 'step': 6892, 'global_step': 17449} [2025-08-17 19:02:44] {'loss': '0.6500', 'loss_video': '0.2536', 'loss_audio': '0.3964', 'step': 6902, 'global_step': 17459} [2025-08-17 19:05:15] {'loss': '0.7131', 'loss_video': '0.2893', 'loss_audio': '0.4238', 'step': 6912, 'global_step': 17469} [2025-08-17 19:07:40] {'loss': '0.7241', 'loss_video': '0.2680', 'loss_audio': '0.4561', 'step': 6922, 'global_step': 17479} [2025-08-17 19:10:13] {'loss': '0.5846', 'loss_video': '0.2115', 'loss_audio': '0.3731', 'step': 6932, 'global_step': 17489} [2025-08-17 19:12:50] {'loss': '0.6639', 'loss_video': '0.2565', 'loss_audio': '0.4074', 'step': 6942, 'global_step': 17499} [2025-08-17 19:12:57] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 19:13:16] Saved checkpoint at epoch 1, step 6943, global_step 17500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17500 [2025-08-17 19:13:16] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 19:15:26] {'loss': '0.6376', 'loss_video': '0.2332', 'loss_audio': '0.4045', 'step': 6952, 'global_step': 17509} [2025-08-17 19:18:00] {'loss': '0.7071', 'loss_video': '0.2953', 'loss_audio': '0.4118', 'step': 6962, 'global_step': 17519} [2025-08-17 19:20:38] {'loss': '0.6919', 'loss_video': '0.2563', 'loss_audio': '0.4356', 'step': 6972, 'global_step': 17529} [2025-08-17 19:22:54] {'loss': '0.6443', 'loss_video': '0.2592', 'loss_audio': '0.3851', 'step': 6982, 'global_step': 17539} [2025-08-17 19:25:30] {'loss': '0.6723', 'loss_video': '0.2782', 'loss_audio': '0.3941', 'step': 6992, 'global_step': 17549} [2025-08-17 19:27:49] {'loss': '0.6855', 'loss_video': '0.2668', 'loss_audio': '0.4186', 'step': 7002, 'global_step': 17559} [2025-08-17 19:29:09] Number of bands (47) exceeds limit (43). [2025-08-17 19:29:58] {'loss': '0.6529', 'loss_video': '0.2488', 'loss_audio': '0.4041', 'step': 7012, 'global_step': 17569} [2025-08-17 19:32:22] {'loss': '0.7139', 'loss_video': '0.2925', 'loss_audio': '0.4215', 'step': 7022, 'global_step': 17579} [2025-08-17 19:34:37] {'loss': '0.6504', 'loss_video': '0.2300', 'loss_audio': '0.4204', 'step': 7032, 'global_step': 17589} [2025-08-17 19:36:47] {'loss': '0.6790', 'loss_video': '0.2736', 'loss_audio': '0.4055', 'step': 7042, 'global_step': 17599} [2025-08-17 19:39:06] {'loss': '0.6973', 'loss_video': '0.2640', 'loss_audio': '0.4333', 'step': 7052, 'global_step': 17609} [2025-08-17 19:41:26] {'loss': '0.6585', 'loss_video': '0.2427', 'loss_audio': '0.4158', 'step': 7062, 'global_step': 17619} [2025-08-17 19:44:12] {'loss': '0.6288', 'loss_video': '0.2489', 'loss_audio': '0.3799', 'step': 7072, 'global_step': 17629} [2025-08-17 19:46:20] {'loss': '0.6605', 'loss_video': '0.2417', 'loss_audio': '0.4188', 'step': 7082, 'global_step': 17639} [2025-08-17 19:48:49] {'loss': '0.6300', 'loss_video': '0.2361', 'loss_audio': '0.3939', 'step': 7092, 'global_step': 17649} [2025-08-17 19:51:20] {'loss': '0.6401', 'loss_video': '0.2641', 'loss_audio': '0.3760', 'step': 7102, 'global_step': 17659}
[2025-08-17 19:54:07] {'loss': '0.6524', 'loss_video': '0.2592', 'loss_audio': '0.3932', 'step': 7112, 'global_step': 17669} [2025-08-17 19:56:42] {'loss': '0.6379', 'loss_video': '0.2471', 'loss_audio': '0.3908', 'step': 7122, 'global_step': 17679} [2025-08-17 19:58:44] {'loss': '0.7169', 'loss_video': '0.2686', 'loss_audio': '0.4482', 'step': 7132, 'global_step': 17689} [2025-08-17 20:01:23] {'loss': '0.6570', 'loss_video': '0.2573', 'loss_audio': '0.3997', 'step': 7142, 'global_step': 17699} [2025-08-17 20:03:56] {'loss': '0.6142', 'loss_video': '0.2198', 'loss_audio': '0.3944', 'step': 7152, 'global_step': 17709} [2025-08-17 20:06:21] {'loss': '0.6984', 'loss_video': '0.2769', 'loss_audio': '0.4215', 'step': 7162, 'global_step': 17719} [2025-08-17 20:08:52] {'loss': '0.6185', 'loss_video': '0.2398', 'loss_audio': '0.3787', 'step': 7172, 'global_step': 17729} [2025-08-17 20:11:38] {'loss': '0.6749', 'loss_video': '0.2881', 'loss_audio': '0.3868', 'step': 7182, 'global_step': 17739} [2025-08-17 20:13:59] {'loss': '0.6325', 'loss_video': '0.2297', 'loss_audio': '0.4028', 'step': 7192, 'global_step': 17749} [2025-08-17 20:14:06] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 20:14:24] Saved checkpoint at epoch 1, step 7193, global_step 17750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17750 [2025-08-17 20:14:24] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17250 has been deleted successfully as cfg.save_total_limit!
[2025-08-17 20:16:55] {'loss': '0.7158', 'loss_video': '0.3095', 'loss_audio': '0.4063', 'step': 7202, 'global_step': 17759} [2025-08-17 20:19:32] {'loss': '0.6838', 'loss_video': '0.2573', 'loss_audio': '0.4265', 'step': 7212, 'global_step': 17769} [2025-08-17 20:22:02] {'loss': '0.6837', 'loss_video': '0.2552', 'loss_audio': '0.4285', 'step': 7222, 'global_step': 17779} [2025-08-17 20:24:52] {'loss': '0.6930', 'loss_video': '0.2657', 'loss_audio': '0.4272', 'step': 7232, 'global_step': 17789} [2025-08-17 20:27:17] {'loss': '0.6922', 'loss_video': '0.2580', 'loss_audio': '0.4342', 'step': 7242, 'global_step': 17799} [2025-08-17 20:29:32] {'loss': '0.6926', 'loss_video': '0.2778', 'loss_audio': '0.4148', 'step': 7252, 'global_step': 17809} [2025-08-17 20:32:11] {'loss': '0.6794', 'loss_video': '0.2711', 'loss_audio': '0.4083', 'step': 7262, 'global_step': 17819} [2025-08-17 20:34:54] {'loss': '0.6313', 'loss_video': '0.2371', 'loss_audio': '0.3942', 'step': 7272, 'global_step': 17829} [2025-08-17 20:37:17] {'loss': '0.6805', 'loss_video': '0.2514', 'loss_audio': '0.4291', 'step': 7282, 'global_step': 17839} [2025-08-17 20:39:58] {'loss': '0.6654', 'loss_video': '0.2820', 'loss_audio': '0.3834', 'step': 7292, 'global_step': 17849} [2025-08-17 20:42:21] {'loss': '0.6565', 'loss_video': '0.2362', 'loss_audio': '0.4203', 'step': 7302, 'global_step': 17859} [2025-08-17 20:44:58] {'loss': '0.7260', 'loss_video': '0.2786', 'loss_audio': '0.4475', 'step': 7312, 'global_step': 17869} [2025-08-17 20:47:15] {'loss': '0.6610', 'loss_video': '0.2465', 'loss_audio': '0.4145', 'step': 7322, 'global_step': 17879} [2025-08-17 20:49:53] {'loss': '0.6561', 'loss_video': '0.2614', 'loss_audio': '0.3948', 'step': 7332, 'global_step': 17889} [2025-08-17 20:52:23] {'loss': '0.7021', 'loss_video': '0.2542', 'loss_audio': '0.4479', 'step': 7342, 'global_step': 17899} [2025-08-17 20:54:56] {'loss': '0.6918', 'loss_video': '0.2584', 'loss_audio': '0.4334', 'step': 7352, 'global_step': 17909} 
[2025-08-17 20:57:31] {'loss': '0.6741', 'loss_video': '0.2839', 'loss_audio': '0.3902', 'step': 7362, 'global_step': 17919} [2025-08-17 21:00:11] {'loss': '0.7000', 'loss_video': '0.2776', 'loss_audio': '0.4225', 'step': 7372, 'global_step': 17929} [2025-08-17 21:02:26] {'loss': '0.6443', 'loss_video': '0.2525', 'loss_audio': '0.3919', 'step': 7382, 'global_step': 17939} [2025-08-17 21:04:54] {'loss': '0.7390', 'loss_video': '0.2843', 'loss_audio': '0.4547', 'step': 7392, 'global_step': 17949} [2025-08-17 21:07:28] {'loss': '0.7342', 'loss_video': '0.2781', 'loss_audio': '0.4561', 'step': 7402, 'global_step': 17959} [2025-08-17 21:09:50] {'loss': '0.6482', 'loss_video': '0.2617', 'loss_audio': '0.3865', 'step': 7412, 'global_step': 17969} [2025-08-17 21:12:19] {'loss': '0.6853', 'loss_video': '0.2733', 'loss_audio': '0.4119', 'step': 7422, 'global_step': 17979} [2025-08-17 21:14:28] {'loss': '0.6432', 'loss_video': '0.2402', 'loss_audio': '0.4030', 'step': 7432, 'global_step': 17989} [2025-08-17 21:17:26] {'loss': '0.7623', 'loss_video': '0.2765', 'loss_audio': '0.4858', 'step': 7442, 'global_step': 17999} [2025-08-17 21:17:32] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 21:17:50] Saved checkpoint at epoch 1, step 7443, global_step 18000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18000 [2025-08-17 21:17:51] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17500 has been deleted successfully as cfg.save_total_limit! 
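The save/delete pairs above reflect the run's `save_total_limit: 2` setting: each new checkpoint (written every `ckpt_every: 250` global steps) is followed by deletion of the oldest retained one, so at most two checkpoint directories stay on disk. A minimal sketch of one such rotation step, assuming directories named `epochXXX-global_stepNNNNN` as in the log (the helper name is hypothetical, not the trainer's actual code):

```python
import os
import shutil

def rotate_checkpoints(exp_dir: str, save_total_limit: int = 2) -> list[str]:
    """Remove the oldest checkpoint dirs so at most `save_total_limit` remain.

    Hypothetical helper: assumes save_total_limit >= 1 and directory names
    ending in 'global_step<NNNNN>', e.g. 'epoch001-global_step17750'.
    Returns the paths that were deleted.
    """
    ckpts = [d for d in os.listdir(exp_dir) if "global_step" in d]
    # Sort by the numeric global step so the oldest checkpoints come first.
    ckpts.sort(key=lambda d: int(d.rsplit("global_step", 1)[1]))
    removed = []
    for name in ckpts[:-save_total_limit]:
        path = os.path.join(exp_dir, name)
        shutil.rmtree(path)
        removed.append(path)
    return removed
```

Sorting on the parsed `global_step` rather than directory modification time keeps the policy deterministic even if checkpoint files are touched after saving.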
[2025-08-17 21:20:12] {'loss': '0.6699', 'loss_video': '0.2677', 'loss_audio': '0.4022', 'step': 7452, 'global_step': 18009} [2025-08-17 21:22:24] {'loss': '0.6800', 'loss_video': '0.2699', 'loss_audio': '0.4100', 'step': 7462, 'global_step': 18019} [2025-08-17 21:24:49] {'loss': '0.6215', 'loss_video': '0.2339', 'loss_audio': '0.3876', 'step': 7472, 'global_step': 18029} [2025-08-17 21:27:40] {'loss': '0.6732', 'loss_video': '0.2526', 'loss_audio': '0.4205', 'step': 7482, 'global_step': 18039} [2025-08-17 21:30:06] {'loss': '0.6475', 'loss_video': '0.2330', 'loss_audio': '0.4145', 'step': 7492, 'global_step': 18049} [2025-08-17 21:32:31] {'loss': '0.6724', 'loss_video': '0.2417', 'loss_audio': '0.4307', 'step': 7502, 'global_step': 18059} [2025-08-17 21:34:54] {'loss': '0.6809', 'loss_video': '0.2621', 'loss_audio': '0.4188', 'step': 7512, 'global_step': 18069} [2025-08-17 21:37:20] {'loss': '0.6910', 'loss_video': '0.2653', 'loss_audio': '0.4257', 'step': 7522, 'global_step': 18079} [2025-08-17 21:39:59] {'loss': '0.7259', 'loss_video': '0.3089', 'loss_audio': '0.4170', 'step': 7532, 'global_step': 18089} [2025-08-17 21:42:36] {'loss': '0.6927', 'loss_video': '0.2717', 'loss_audio': '0.4210', 'step': 7542, 'global_step': 18099} [2025-08-17 21:45:05] {'loss': '0.7138', 'loss_video': '0.2804', 'loss_audio': '0.4334', 'step': 7552, 'global_step': 18109} [2025-08-17 21:47:09] {'loss': '0.6579', 'loss_video': '0.2455', 'loss_audio': '0.4124', 'step': 7562, 'global_step': 18119} [2025-08-17 21:49:47] {'loss': '0.7047', 'loss_video': '0.2588', 'loss_audio': '0.4459', 'step': 7572, 'global_step': 18129} [2025-08-17 21:52:17] {'loss': '0.6362', 'loss_video': '0.2549', 'loss_audio': '0.3813', 'step': 7582, 'global_step': 18139} [2025-08-17 21:54:36] {'loss': '0.6987', 'loss_video': '0.2616', 'loss_audio': '0.4371', 'step': 7592, 'global_step': 18149} [2025-08-17 21:57:05] {'loss': '0.6772', 'loss_video': '0.2663', 'loss_audio': '0.4109', 'step': 7602, 'global_step': 18159} 
[2025-08-17 21:59:40] {'loss': '0.6932', 'loss_video': '0.2493', 'loss_audio': '0.4439', 'step': 7612, 'global_step': 18169} [2025-08-17 22:01:56] {'loss': '0.6632', 'loss_video': '0.2565', 'loss_audio': '0.4067', 'step': 7622, 'global_step': 18179} [2025-08-17 22:04:39] {'loss': '0.6748', 'loss_video': '0.2730', 'loss_audio': '0.4018', 'step': 7632, 'global_step': 18189} [2025-08-17 22:06:53] {'loss': '0.6955', 'loss_video': '0.2700', 'loss_audio': '0.4254', 'step': 7642, 'global_step': 18199} [2025-08-17 22:09:07] {'loss': '0.6506', 'loss_video': '0.2487', 'loss_audio': '0.4019', 'step': 7652, 'global_step': 18209} [2025-08-17 22:11:27] {'loss': '0.6670', 'loss_video': '0.2732', 'loss_audio': '0.3938', 'step': 7662, 'global_step': 18219} [2025-08-17 22:13:41] {'loss': '0.6804', 'loss_video': '0.2704', 'loss_audio': '0.4100', 'step': 7672, 'global_step': 18229} [2025-08-17 22:15:54] {'loss': '0.7204', 'loss_video': '0.2648', 'loss_audio': '0.4556', 'step': 7682, 'global_step': 18239} [2025-08-17 22:18:32] {'loss': '0.6843', 'loss_video': '0.2613', 'loss_audio': '0.4229', 'step': 7692, 'global_step': 18249} [2025-08-17 22:18:39] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 22:18:57] Saved checkpoint at epoch 1, step 7693, global_step 18250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18250 [2025-08-17 22:18:57] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step17750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 22:21:17] {'loss': '0.6226', 'loss_video': '0.2348', 'loss_audio': '0.3878', 'step': 7702, 'global_step': 18259} [2025-08-17 22:23:34] {'loss': '0.6750', 'loss_video': '0.2359', 'loss_audio': '0.4391', 'step': 7712, 'global_step': 18269} [2025-08-17 22:25:57] {'loss': '0.6567', 'loss_video': '0.2635', 'loss_audio': '0.3932', 'step': 7722, 'global_step': 18279} [2025-08-17 22:28:30] {'loss': '0.7123', 'loss_video': '0.2600', 'loss_audio': '0.4523', 'step': 7732, 'global_step': 18289} [2025-08-17 22:31:16] {'loss': '0.6475', 'loss_video': '0.2617', 'loss_audio': '0.3858', 'step': 7742, 'global_step': 18299} [2025-08-17 22:33:42] {'loss': '0.6619', 'loss_video': '0.2592', 'loss_audio': '0.4026', 'step': 7752, 'global_step': 18309} [2025-08-17 22:36:12] {'loss': '0.6517', 'loss_video': '0.2721', 'loss_audio': '0.3796', 'step': 7762, 'global_step': 18319} [2025-08-17 22:39:02] {'loss': '0.6484', 'loss_video': '0.2417', 'loss_audio': '0.4067', 'step': 7772, 'global_step': 18329} [2025-08-17 22:41:21] {'loss': '0.6700', 'loss_video': '0.2424', 'loss_audio': '0.4276', 'step': 7782, 'global_step': 18339} [2025-08-17 22:43:44] {'loss': '0.7107', 'loss_video': '0.2827', 'loss_audio': '0.4280', 'step': 7792, 'global_step': 18349} [2025-08-17 22:45:59] {'loss': '0.6409', 'loss_video': '0.2366', 'loss_audio': '0.4043', 'step': 7802, 'global_step': 18359} [2025-08-17 22:48:37] {'loss': '0.6537', 'loss_video': '0.2559', 'loss_audio': '0.3978', 'step': 7812, 'global_step': 18369} [2025-08-17 22:51:02] {'loss': '0.6940', 'loss_video': '0.2691', 'loss_audio': '0.4249', 'step': 7822, 'global_step': 18379} [2025-08-17 22:53:09] {'loss': '0.6086', 'loss_video': '0.2420', 'loss_audio': '0.3666', 'step': 7832, 'global_step': 18389} [2025-08-17 22:55:45] {'loss': '0.6757', 'loss_video': '0.2727', 'loss_audio': '0.4029', 'step': 7842, 'global_step': 18399} [2025-08-17 22:58:15] {'loss': '0.6374', 'loss_video': '0.2511', 'loss_audio': '0.3863', 'step': 7852, 'global_step': 18409} 
[2025-08-17 23:00:33] {'loss': '0.6351', 'loss_video': '0.2235', 'loss_audio': '0.4116', 'step': 7862, 'global_step': 18419} [2025-08-17 23:02:59] {'loss': '0.6713', 'loss_video': '0.2449', 'loss_audio': '0.4264', 'step': 7872, 'global_step': 18429} [2025-08-17 23:05:30] {'loss': '0.7131', 'loss_video': '0.2738', 'loss_audio': '0.4392', 'step': 7882, 'global_step': 18439} [2025-08-17 23:08:00] {'loss': '0.6525', 'loss_video': '0.2581', 'loss_audio': '0.3944', 'step': 7892, 'global_step': 18449} [2025-08-17 23:10:46] {'loss': '0.7150', 'loss_video': '0.2870', 'loss_audio': '0.4280', 'step': 7902, 'global_step': 18459} [2025-08-17 23:13:04] {'loss': '0.6552', 'loss_video': '0.2532', 'loss_audio': '0.4021', 'step': 7912, 'global_step': 18469} [2025-08-17 23:15:47] {'loss': '0.6508', 'loss_video': '0.2722', 'loss_audio': '0.3785', 'step': 7922, 'global_step': 18479} [2025-08-17 23:18:29] {'loss': '0.6193', 'loss_video': '0.2506', 'loss_audio': '0.3687', 'step': 7932, 'global_step': 18489} [2025-08-17 23:21:13] {'loss': '0.6958', 'loss_video': '0.2638', 'loss_audio': '0.4320', 'step': 7942, 'global_step': 18499} [2025-08-17 23:21:20] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-17 23:21:37] Saved checkpoint at epoch 1, step 7943, global_step 18500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18500 [2025-08-17 23:21:37] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-17 23:24:00] {'loss': '0.6840', 'loss_video': '0.2435', 'loss_audio': '0.4404', 'step': 7952, 'global_step': 18509} [2025-08-17 23:26:23] {'loss': '0.6479', 'loss_video': '0.2526', 'loss_audio': '0.3953', 'step': 7962, 'global_step': 18519} [2025-08-17 23:28:41] {'loss': '0.7066', 'loss_video': '0.2810', 'loss_audio': '0.4255', 'step': 7972, 'global_step': 18529} [2025-08-17 23:31:19] {'loss': '0.6235', 'loss_video': '0.2465', 'loss_audio': '0.3770', 'step': 7982, 'global_step': 18539} [2025-08-17 23:33:48] {'loss': '0.6867', 'loss_video': '0.2792', 'loss_audio': '0.4075', 'step': 7992, 'global_step': 18549} [2025-08-17 23:36:10] {'loss': '0.5891', 'loss_video': '0.2198', 'loss_audio': '0.3693', 'step': 8002, 'global_step': 18559} [2025-08-17 23:38:39] {'loss': '0.6908', 'loss_video': '0.2713', 'loss_audio': '0.4195', 'step': 8012, 'global_step': 18569} [2025-08-17 23:41:03] {'loss': '0.7575', 'loss_video': '0.3030', 'loss_audio': '0.4545', 'step': 8022, 'global_step': 18579} [2025-08-17 23:43:53] {'loss': '0.6548', 'loss_video': '0.2702', 'loss_audio': '0.3846', 'step': 8032, 'global_step': 18589} [2025-08-17 23:46:28] {'loss': '0.7287', 'loss_video': '0.2965', 'loss_audio': '0.4322', 'step': 8042, 'global_step': 18599} [2025-08-17 23:49:07] {'loss': '0.7119', 'loss_video': '0.2803', 'loss_audio': '0.4317', 'step': 8052, 'global_step': 18609} [2025-08-17 23:51:32] {'loss': '0.6971', 'loss_video': '0.2603', 'loss_audio': '0.4367', 'step': 8062, 'global_step': 18619} [2025-08-17 23:53:56] {'loss': '0.6725', 'loss_video': '0.2548', 'loss_audio': '0.4177', 'step': 8072, 'global_step': 18629} [2025-08-17 23:56:12] {'loss': '0.6824', 'loss_video': '0.2667', 'loss_audio': '0.4157', 'step': 8082, 'global_step': 18639} [2025-08-17 23:58:41] {'loss': '0.6232', 'loss_video': '0.2399', 'loss_audio': '0.3833', 'step': 8092, 'global_step': 18649} [2025-08-18 00:01:07] {'loss': '0.6916', 'loss_video': '0.2668', 'loss_audio': '0.4249', 'step': 8102, 'global_step': 18659} 
[2025-08-18 00:03:45] {'loss': '0.6405', 'loss_video': '0.2770', 'loss_audio': '0.3635', 'step': 8112, 'global_step': 18669} [2025-08-18 00:06:20] {'loss': '0.6712', 'loss_video': '0.2788', 'loss_audio': '0.3925', 'step': 8122, 'global_step': 18679} [2025-08-18 00:08:59] {'loss': '0.6186', 'loss_video': '0.2294', 'loss_audio': '0.3892', 'step': 8132, 'global_step': 18689} [2025-08-18 00:11:36] {'loss': '0.6737', 'loss_video': '0.2673', 'loss_audio': '0.4063', 'step': 8142, 'global_step': 18699} [2025-08-18 00:13:36] {'loss': '0.6736', 'loss_video': '0.2682', 'loss_audio': '0.4053', 'step': 8152, 'global_step': 18709} [2025-08-18 00:16:03] {'loss': '0.6827', 'loss_video': '0.2611', 'loss_audio': '0.4216', 'step': 8162, 'global_step': 18719} [2025-08-18 00:18:39] {'loss': '0.7690', 'loss_video': '0.3243', 'loss_audio': '0.4448', 'step': 8172, 'global_step': 18729} [2025-08-18 00:21:26] {'loss': '0.7554', 'loss_video': '0.2706', 'loss_audio': '0.4848', 'step': 8182, 'global_step': 18739} [2025-08-18 00:23:56] {'loss': '0.7338', 'loss_video': '0.2753', 'loss_audio': '0.4585', 'step': 8192, 'global_step': 18749} [2025-08-18 00:24:02] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 00:24:20] Saved checkpoint at epoch 1, step 8193, global_step 18750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18750 [2025-08-18 00:24:20] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18250 has been deleted successfully as cfg.save_total_limit! 
[2025-08-18 00:26:54] {'loss': '0.6943', 'loss_video': '0.2594', 'loss_audio': '0.4349', 'step': 8202, 'global_step': 18759} [2025-08-18 00:29:17] {'loss': '0.6703', 'loss_video': '0.2513', 'loss_audio': '0.4190', 'step': 8212, 'global_step': 18769} [2025-08-18 00:31:45] {'loss': '0.6299', 'loss_video': '0.2490', 'loss_audio': '0.3809', 'step': 8222, 'global_step': 18779} [2025-08-18 00:34:06] {'loss': '0.6462', 'loss_video': '0.2746', 'loss_audio': '0.3716', 'step': 8232, 'global_step': 18789} [2025-08-18 00:36:41] {'loss': '0.7157', 'loss_video': '0.2670', 'loss_audio': '0.4487', 'step': 8242, 'global_step': 18799} [2025-08-18 00:39:04] {'loss': '0.6201', 'loss_video': '0.2516', 'loss_audio': '0.3685', 'step': 8252, 'global_step': 18809} [2025-08-18 00:41:18] {'loss': '0.6705', 'loss_video': '0.2565', 'loss_audio': '0.4141', 'step': 8262, 'global_step': 18819} [2025-08-18 00:44:01] {'loss': '0.7490', 'loss_video': '0.2683', 'loss_audio': '0.4807', 'step': 8272, 'global_step': 18829} [2025-08-18 00:46:52] {'loss': '0.7166', 'loss_video': '0.3140', 'loss_audio': '0.4026', 'step': 8282, 'global_step': 18839} [2025-08-18 00:49:23] {'loss': '0.6599', 'loss_video': '0.2617', 'loss_audio': '0.3982', 'step': 8292, 'global_step': 18849} [2025-08-18 00:51:48] {'loss': '0.6003', 'loss_video': '0.2275', 'loss_audio': '0.3728', 'step': 8302, 'global_step': 18859} [2025-08-18 00:54:15] {'loss': '0.6661', 'loss_video': '0.2633', 'loss_audio': '0.4028', 'step': 8312, 'global_step': 18869} [2025-08-18 00:56:49] {'loss': '0.6367', 'loss_video': '0.2553', 'loss_audio': '0.3814', 'step': 8322, 'global_step': 18879} [2025-08-18 00:59:17] {'loss': '0.7056', 'loss_video': '0.2751', 'loss_audio': '0.4305', 'step': 8332, 'global_step': 18889} [2025-08-18 01:02:05] {'loss': '0.6141', 'loss_video': '0.2307', 'loss_audio': '0.3835', 'step': 8342, 'global_step': 18899} [2025-08-18 01:04:31] {'loss': '0.6443', 'loss_video': '0.2546', 'loss_audio': '0.3897', 'step': 8352, 'global_step': 18909} 
[2025-08-18 01:07:07] {'loss': '0.6932', 'loss_video': '0.2716', 'loss_audio': '0.4216', 'step': 8362, 'global_step': 18919} [2025-08-18 01:09:28] {'loss': '0.7060', 'loss_video': '0.2601', 'loss_audio': '0.4459', 'step': 8372, 'global_step': 18929} [2025-08-18 01:12:00] {'loss': '0.6545', 'loss_video': '0.2626', 'loss_audio': '0.3919', 'step': 8382, 'global_step': 18939} [2025-08-18 01:14:27] {'loss': '0.7187', 'loss_video': '0.2808', 'loss_audio': '0.4379', 'step': 8392, 'global_step': 18949} [2025-08-18 01:17:06] {'loss': '0.6677', 'loss_video': '0.2543', 'loss_audio': '0.4135', 'step': 8402, 'global_step': 18959} [2025-08-18 01:19:31] {'loss': '0.6394', 'loss_video': '0.2503', 'loss_audio': '0.3890', 'step': 8412, 'global_step': 18969} [2025-08-18 01:22:14] {'loss': '0.7532', 'loss_video': '0.2769', 'loss_audio': '0.4764', 'step': 8422, 'global_step': 18979} [2025-08-18 01:24:39] {'loss': '0.6527', 'loss_video': '0.2658', 'loss_audio': '0.3868', 'step': 8432, 'global_step': 18989} [2025-08-18 01:27:17] {'loss': '0.6861', 'loss_video': '0.2556', 'loss_audio': '0.4305', 'step': 8442, 'global_step': 18999} [2025-08-18 01:27:23] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 01:27:41] Saved checkpoint at epoch 1, step 8443, global_step 19000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19000 [2025-08-18 01:27:41] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18500 has been deleted successfully as cfg.save_total_limit! 
[2025-08-18 01:30:18] {'loss': '0.6532', 'loss_video': '0.2602', 'loss_audio': '0.3930', 'step': 8452, 'global_step': 19009} [2025-08-18 01:32:36] {'loss': '0.7276', 'loss_video': '0.2750', 'loss_audio': '0.4527', 'step': 8462, 'global_step': 19019} [2025-08-18 01:35:10] {'loss': '0.6864', 'loss_video': '0.2392', 'loss_audio': '0.4472', 'step': 8472, 'global_step': 19029} [2025-08-18 01:37:33] {'loss': '0.6487', 'loss_video': '0.2268', 'loss_audio': '0.4218', 'step': 8482, 'global_step': 19039} [2025-08-18 01:39:55] {'loss': '0.6275', 'loss_video': '0.2554', 'loss_audio': '0.3721', 'step': 8492, 'global_step': 19049} [2025-08-18 01:42:14] {'loss': '0.6556', 'loss_video': '0.2713', 'loss_audio': '0.3843', 'step': 8502, 'global_step': 19059} [2025-08-18 01:44:34] {'loss': '0.6883', 'loss_video': '0.2767', 'loss_audio': '0.4116', 'step': 8512, 'global_step': 19069} [2025-08-18 01:47:00] {'loss': '0.7101', 'loss_video': '0.2943', 'loss_audio': '0.4158', 'step': 8522, 'global_step': 19079} [2025-08-18 01:49:33] {'loss': '0.6546', 'loss_video': '0.2498', 'loss_audio': '0.4048', 'step': 8532, 'global_step': 19089} [2025-08-18 01:51:57] {'loss': '0.6938', 'loss_video': '0.2613', 'loss_audio': '0.4325', 'step': 8542, 'global_step': 19099} [2025-08-18 01:54:33] {'loss': '0.6511', 'loss_video': '0.2437', 'loss_audio': '0.4074', 'step': 8552, 'global_step': 19109} [2025-08-18 01:56:49] {'loss': '0.7303', 'loss_video': '0.2671', 'loss_audio': '0.4633', 'step': 8562, 'global_step': 19119} [2025-08-18 01:59:20] {'loss': '0.5871', 'loss_video': '0.2162', 'loss_audio': '0.3709', 'step': 8572, 'global_step': 19129} [2025-08-18 02:01:38] {'loss': '0.6690', 'loss_video': '0.2572', 'loss_audio': '0.4119', 'step': 8582, 'global_step': 19139} [2025-08-18 02:04:03] {'loss': '0.7412', 'loss_video': '0.3085', 'loss_audio': '0.4326', 'step': 8592, 'global_step': 19149} [2025-08-18 02:06:33] {'loss': '0.7072', 'loss_video': '0.2649', 'loss_audio': '0.4423', 'step': 8602, 'global_step': 19159} 
[2025-08-18 02:09:16] {'loss': '0.6577', 'loss_video': '0.2578', 'loss_audio': '0.3999', 'step': 8612, 'global_step': 19169} [2025-08-18 02:11:56] {'loss': '0.7296', 'loss_video': '0.2707', 'loss_audio': '0.4589', 'step': 8622, 'global_step': 19179} [2025-08-18 02:14:48] {'loss': '0.6767', 'loss_video': '0.2783', 'loss_audio': '0.3984', 'step': 8632, 'global_step': 19189} [2025-08-18 02:17:05] {'loss': '0.6854', 'loss_video': '0.3096', 'loss_audio': '0.3758', 'step': 8642, 'global_step': 19199} [2025-08-18 02:19:33] {'loss': '0.6252', 'loss_video': '0.2454', 'loss_audio': '0.3798', 'step': 8652, 'global_step': 19209} [2025-08-18 02:22:05] {'loss': '0.6397', 'loss_video': '0.2588', 'loss_audio': '0.3809', 'step': 8662, 'global_step': 19219} [2025-08-18 02:24:14] {'loss': '0.6888', 'loss_video': '0.2906', 'loss_audio': '0.3982', 'step': 8672, 'global_step': 19229} [2025-08-18 02:26:42] {'loss': '0.6576', 'loss_video': '0.2348', 'loss_audio': '0.4228', 'step': 8682, 'global_step': 19239} [2025-08-18 02:29:06] {'loss': '0.5902', 'loss_video': '0.2247', 'loss_audio': '0.3655', 'step': 8692, 'global_step': 19249} [2025-08-18 02:29:13] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 02:29:30] Saved checkpoint at epoch 1, step 8693, global_step 19250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19250 [2025-08-18 02:29:30] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step18750 has been deleted successfully as cfg.save_total_limit! 
[2025-08-18 02:31:57] {'loss': '0.6566', 'loss_video': '0.2610', 'loss_audio': '0.3956', 'step': 8702, 'global_step': 19259} [2025-08-18 02:34:43] {'loss': '0.7071', 'loss_video': '0.2820', 'loss_audio': '0.4251', 'step': 8712, 'global_step': 19269} [2025-08-18 02:37:16] {'loss': '0.6759', 'loss_video': '0.2784', 'loss_audio': '0.3975', 'step': 8722, 'global_step': 19279} [2025-08-18 02:39:07] {'loss': '0.6679', 'loss_video': '0.2227', 'loss_audio': '0.4452', 'step': 8732, 'global_step': 19289} [2025-08-18 02:41:36] {'loss': '0.6566', 'loss_video': '0.2554', 'loss_audio': '0.4011', 'step': 8742, 'global_step': 19299} [2025-08-18 02:44:19] {'loss': '0.6654', 'loss_video': '0.2711', 'loss_audio': '0.3943', 'step': 8752, 'global_step': 19309} [2025-08-18 02:46:42] {'loss': '0.6770', 'loss_video': '0.2730', 'loss_audio': '0.4040', 'step': 8762, 'global_step': 19319} [2025-08-18 02:49:01] {'loss': '0.6307', 'loss_video': '0.2308', 'loss_audio': '0.3999', 'step': 8772, 'global_step': 19329} [2025-08-18 02:51:15] {'loss': '0.6639', 'loss_video': '0.2597', 'loss_audio': '0.4041', 'step': 8782, 'global_step': 19339} [2025-08-18 02:53:25] {'loss': '0.6665', 'loss_video': '0.2575', 'loss_audio': '0.4090', 'step': 8792, 'global_step': 19349} [2025-08-18 02:56:06] {'loss': '0.6187', 'loss_video': '0.2398', 'loss_audio': '0.3789', 'step': 8802, 'global_step': 19359} [2025-08-18 02:58:22] {'loss': '0.7153', 'loss_video': '0.2556', 'loss_audio': '0.4597', 'step': 8812, 'global_step': 19369} [2025-08-18 03:00:52] {'loss': '0.6697', 'loss_video': '0.2482', 'loss_audio': '0.4215', 'step': 8822, 'global_step': 19379} [2025-08-18 03:03:44] {'loss': '0.7084', 'loss_video': '0.2783', 'loss_audio': '0.4300', 'step': 8832, 'global_step': 19389} [2025-08-18 03:06:05] {'loss': '0.7184', 'loss_video': '0.3045', 'loss_audio': '0.4139', 'step': 8842, 'global_step': 19399} [2025-08-18 03:08:42] {'loss': '0.7074', 'loss_video': '0.2881', 'loss_audio': '0.4194', 'step': 8852, 'global_step': 19409} 
[2025-08-18 03:11:13] {'loss': '0.6738', 'loss_video': '0.2640', 'loss_audio': '0.4098', 'step': 8862, 'global_step': 19419} [2025-08-18 03:13:38] {'loss': '0.6861', 'loss_video': '0.2555', 'loss_audio': '0.4306', 'step': 8872, 'global_step': 19429} [2025-08-18 03:16:10] {'loss': '0.7389', 'loss_video': '0.2654', 'loss_audio': '0.4735', 'step': 8882, 'global_step': 19439} [2025-08-18 03:18:43] {'loss': '0.6579', 'loss_video': '0.2461', 'loss_audio': '0.4118', 'step': 8892, 'global_step': 19449} [2025-08-18 03:21:10] {'loss': '0.6926', 'loss_video': '0.2758', 'loss_audio': '0.4169', 'step': 8902, 'global_step': 19459} [2025-08-18 03:23:06] {'loss': '0.6783', 'loss_video': '0.2615', 'loss_audio': '0.4168', 'step': 8912, 'global_step': 19469} [2025-08-18 03:25:42] {'loss': '0.6461', 'loss_video': '0.2514', 'loss_audio': '0.3947', 'step': 8922, 'global_step': 19479} [2025-08-18 03:28:11] {'loss': '0.6884', 'loss_video': '0.2642', 'loss_audio': '0.4241', 'step': 8932, 'global_step': 19489} [2025-08-18 03:30:52] {'loss': '0.6295', 'loss_video': '0.2368', 'loss_audio': '0.3927', 'step': 8942, 'global_step': 19499} [2025-08-18 03:30:59] The model is going to be split to checkpoint shards. You can find where each parameters has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 03:31:17] Saved checkpoint at epoch 1, step 8943, global_step 19500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19500 [2025-08-18 03:31:17] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19000 has been deleted successfully as cfg.save_total_limit! 
[2025-08-18 03:33:32] {'loss': '0.6634', 'loss_video': '0.2330', 'loss_audio': '0.4303', 'step': 8952, 'global_step': 19509} [2025-08-18 03:36:12] {'loss': '0.7185', 'loss_video': '0.2487', 'loss_audio': '0.4698', 'step': 8962, 'global_step': 19519} [2025-08-18 03:38:35] {'loss': '0.6252', 'loss_video': '0.2312', 'loss_audio': '0.3940', 'step': 8972, 'global_step': 19529} [2025-08-18 03:41:06] {'loss': '0.6312', 'loss_video': '0.2354', 'loss_audio': '0.3958', 'step': 8982, 'global_step': 19539} [2025-08-18 03:43:48] {'loss': '0.6921', 'loss_video': '0.2886', 'loss_audio': '0.4035', 'step': 8992, 'global_step': 19549} [2025-08-18 03:45:59] {'loss': '0.7053', 'loss_video': '0.2471', 'loss_audio': '0.4582', 'step': 9002, 'global_step': 19559} [2025-08-18 03:48:29] {'loss': '0.6906', 'loss_video': '0.2860', 'loss_audio': '0.4046', 'step': 9012, 'global_step': 19569} [2025-08-18 03:50:47] {'loss': '0.6901', 'loss_video': '0.2576', 'loss_audio': '0.4325', 'step': 9022, 'global_step': 19579} [2025-08-18 03:53:15] {'loss': '0.6696', 'loss_video': '0.2609', 'loss_audio': '0.4088', 'step': 9032, 'global_step': 19589} [2025-08-18 03:55:33] {'loss': '0.6197', 'loss_video': '0.2357', 'loss_audio': '0.3840', 'step': 9042, 'global_step': 19599} [2025-08-18 03:57:58] {'loss': '0.6994', 'loss_video': '0.2303', 'loss_audio': '0.4691', 'step': 9052, 'global_step': 19609} [2025-08-18 04:00:26] {'loss': '0.6441', 'loss_video': '0.2500', 'loss_audio': '0.3941', 'step': 9062, 'global_step': 19619} [2025-08-18 04:03:02] {'loss': '0.7070', 'loss_video': '0.2686', 'loss_audio': '0.4384', 'step': 9072, 'global_step': 19629} [2025-08-18 04:05:29] {'loss': '0.6829', 'loss_video': '0.2680', 'loss_audio': '0.4150', 'step': 9082, 'global_step': 19639} [2025-08-18 04:08:01] {'loss': '0.7643', 'loss_video': '0.3052', 'loss_audio': '0.4591', 'step': 9092, 'global_step': 19649} [2025-08-18 04:10:26] {'loss': '0.6872', 'loss_video': '0.2568', 'loss_audio': '0.4304', 'step': 9102, 'global_step': 19659} 
[2025-08-18 04:12:48] {'loss': '0.6515', 'loss_video': '0.2638', 'loss_audio': '0.3877', 'step': 9112, 'global_step': 19669} [2025-08-18 04:15:13] {'loss': '0.6380', 'loss_video': '0.2450', 'loss_audio': '0.3930', 'step': 9122, 'global_step': 19679} [2025-08-18 04:17:41] {'loss': '0.6417', 'loss_video': '0.2415', 'loss_audio': '0.4002', 'step': 9132, 'global_step': 19689} [2025-08-18 04:20:10] {'loss': '0.6926', 'loss_video': '0.2811', 'loss_audio': '0.4115', 'step': 9142, 'global_step': 19699} [2025-08-18 04:22:41] {'loss': '0.6666', 'loss_video': '0.2519', 'loss_audio': '0.4147', 'step': 9152, 'global_step': 19709} [2025-08-18 04:25:07] {'loss': '0.6992', 'loss_video': '0.2863', 'loss_audio': '0.4129', 'step': 9162, 'global_step': 19719} [2025-08-18 04:27:18] {'loss': '0.6343', 'loss_video': '0.2424', 'loss_audio': '0.3919', 'step': 9172, 'global_step': 19729} [2025-08-18 04:29:59] {'loss': '0.6295', 'loss_video': '0.2471', 'loss_audio': '0.3823', 'step': 9182, 'global_step': 19739} [2025-08-18 04:32:33] {'loss': '0.6257', 'loss_video': '0.2518', 'loss_audio': '0.3739', 'step': 9192, 'global_step': 19749} [2025-08-18 04:32:39] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 04:32:57] Saved checkpoint at epoch 1, step 9193, global_step 19750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19750 [2025-08-18 04:32:58] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19250 has been deleted successfully per cfg.save_total_limit.
[2025-08-18 04:35:14] {'loss': '0.6644', 'loss_video': '0.2302', 'loss_audio': '0.4342', 'step': 9202, 'global_step': 19759} [2025-08-18 04:37:44] {'loss': '0.7164', 'loss_video': '0.3031', 'loss_audio': '0.4134', 'step': 9212, 'global_step': 19769} [2025-08-18 04:39:46] {'loss': '0.6739', 'loss_video': '0.2616', 'loss_audio': '0.4123', 'step': 9222, 'global_step': 19779} [2025-08-18 04:42:28] {'loss': '0.7086', 'loss_video': '0.3004', 'loss_audio': '0.4082', 'step': 9232, 'global_step': 19789} [2025-08-18 04:45:21] {'loss': '0.7055', 'loss_video': '0.2866', 'loss_audio': '0.4189', 'step': 9242, 'global_step': 19799} [2025-08-18 04:47:55] {'loss': '0.6598', 'loss_video': '0.2636', 'loss_audio': '0.3963', 'step': 9252, 'global_step': 19809} [2025-08-18 04:50:02] {'loss': '0.6654', 'loss_video': '0.2665', 'loss_audio': '0.3989', 'step': 9262, 'global_step': 19819} [2025-08-18 04:52:29] {'loss': '0.6509', 'loss_video': '0.2647', 'loss_audio': '0.3862', 'step': 9272, 'global_step': 19829} [2025-08-18 04:54:55] {'loss': '0.6808', 'loss_video': '0.2724', 'loss_audio': '0.4084', 'step': 9282, 'global_step': 19839} [2025-08-18 04:57:36] {'loss': '0.6640', 'loss_video': '0.2418', 'loss_audio': '0.4221', 'step': 9292, 'global_step': 19849} [2025-08-18 04:59:57] {'loss': '0.6410', 'loss_video': '0.2474', 'loss_audio': '0.3936', 'step': 9302, 'global_step': 19859} [2025-08-18 05:02:16] {'loss': '0.6772', 'loss_video': '0.2482', 'loss_audio': '0.4290', 'step': 9312, 'global_step': 19869} [2025-08-18 05:04:43] {'loss': '0.6781', 'loss_video': '0.2509', 'loss_audio': '0.4271', 'step': 9322, 'global_step': 19879} [2025-08-18 05:07:08] {'loss': '0.6512', 'loss_video': '0.2458', 'loss_audio': '0.4054', 'step': 9332, 'global_step': 19889} [2025-08-18 05:09:45] {'loss': '0.6718', 'loss_video': '0.2822', 'loss_audio': '0.3895', 'step': 9342, 'global_step': 19899} [2025-08-18 05:12:14] {'loss': '0.7565', 'loss_video': '0.2878', 'loss_audio': '0.4687', 'step': 9352, 'global_step': 19909} 
[2025-08-18 05:15:06] {'loss': '0.6801', 'loss_video': '0.2720', 'loss_audio': '0.4081', 'step': 9362, 'global_step': 19919} [2025-08-18 05:17:24] {'loss': '0.7134', 'loss_video': '0.2582', 'loss_audio': '0.4552', 'step': 9372, 'global_step': 19929} [2025-08-18 05:19:42] {'loss': '0.7022', 'loss_video': '0.2733', 'loss_audio': '0.4289', 'step': 9382, 'global_step': 19939} [2025-08-18 05:22:39] {'loss': '0.6990', 'loss_video': '0.2585', 'loss_audio': '0.4405', 'step': 9392, 'global_step': 19949} [2025-08-18 05:25:07] {'loss': '0.7116', 'loss_video': '0.2604', 'loss_audio': '0.4512', 'step': 9402, 'global_step': 19959} [2025-08-18 05:27:37] {'loss': '0.6298', 'loss_video': '0.2529', 'loss_audio': '0.3769', 'step': 9412, 'global_step': 19969} [2025-08-18 05:30:12] {'loss': '0.7058', 'loss_video': '0.2777', 'loss_audio': '0.4281', 'step': 9422, 'global_step': 19979} [2025-08-18 05:32:43] {'loss': '0.7203', 'loss_video': '0.2665', 'loss_audio': '0.4538', 'step': 9432, 'global_step': 19989} [2025-08-18 05:35:14] {'loss': '0.6832', 'loss_video': '0.2564', 'loss_audio': '0.4267', 'step': 9442, 'global_step': 19999} [2025-08-18 05:35:21] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 05:35:39] Saved checkpoint at epoch 1, step 9443, global_step 20000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step20000 [2025-08-18 05:35:39] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19500 has been deleted successfully per cfg.save_total_limit.
[2025-08-18 05:38:06] {'loss': '0.6298', 'loss_video': '0.2260', 'loss_audio': '0.4039', 'step': 9452, 'global_step': 20009} [2025-08-18 05:40:25] {'loss': '0.7114', 'loss_video': '0.2672', 'loss_audio': '0.4442', 'step': 9462, 'global_step': 20019} [2025-08-18 05:42:46] {'loss': '0.6287', 'loss_video': '0.2337', 'loss_audio': '0.3951', 'step': 9472, 'global_step': 20029} [2025-08-18 05:45:01] {'loss': '0.6550', 'loss_video': '0.2265', 'loss_audio': '0.4285', 'step': 9482, 'global_step': 20039} [2025-08-18 05:47:20] {'loss': '0.6791', 'loss_video': '0.2517', 'loss_audio': '0.4275', 'step': 9492, 'global_step': 20049} [2025-08-18 05:49:44] {'loss': '0.6800', 'loss_video': '0.2405', 'loss_audio': '0.4395', 'step': 9502, 'global_step': 20059} [2025-08-18 05:52:11] {'loss': '0.7056', 'loss_video': '0.2743', 'loss_audio': '0.4313', 'step': 9512, 'global_step': 20069} [2025-08-18 05:54:52] {'loss': '0.6916', 'loss_video': '0.2607', 'loss_audio': '0.4309', 'step': 9522, 'global_step': 20079} [2025-08-18 05:57:06] {'loss': '0.6492', 'loss_video': '0.2674', 'loss_audio': '0.3819', 'step': 9532, 'global_step': 20089} [2025-08-18 05:59:28] {'loss': '0.6900', 'loss_video': '0.2693', 'loss_audio': '0.4206', 'step': 9542, 'global_step': 20099} [2025-08-18 06:01:50] {'loss': '0.6977', 'loss_video': '0.2832', 'loss_audio': '0.4145', 'step': 9552, 'global_step': 20109} [2025-08-18 06:04:31] {'loss': '0.7056', 'loss_video': '0.2910', 'loss_audio': '0.4146', 'step': 9562, 'global_step': 20119} [2025-08-18 06:06:59] {'loss': '0.7322', 'loss_video': '0.2875', 'loss_audio': '0.4448', 'step': 9572, 'global_step': 20129} [2025-08-18 06:09:25] {'loss': '0.6241', 'loss_video': '0.2538', 'loss_audio': '0.3703', 'step': 9582, 'global_step': 20139} [2025-08-18 06:11:52] {'loss': '0.6941', 'loss_video': '0.2675', 'loss_audio': '0.4265', 'step': 9592, 'global_step': 20149} [2025-08-18 06:14:35] {'loss': '0.6699', 'loss_video': '0.2653', 'loss_audio': '0.4046', 'step': 9602, 'global_step': 20159} 
[2025-08-18 06:16:48] {'loss': '0.6702', 'loss_video': '0.2558', 'loss_audio': '0.4144', 'step': 9612, 'global_step': 20169} [2025-08-18 06:18:55] {'loss': '0.6834', 'loss_video': '0.2668', 'loss_audio': '0.4166', 'step': 9622, 'global_step': 20179} [2025-08-18 06:21:26] {'loss': '0.6915', 'loss_video': '0.2834', 'loss_audio': '0.4081', 'step': 9632, 'global_step': 20189} [2025-08-18 06:24:08] {'loss': '0.7073', 'loss_video': '0.2543', 'loss_audio': '0.4530', 'step': 9642, 'global_step': 20199} [2025-08-18 06:26:31] {'loss': '0.6729', 'loss_video': '0.2815', 'loss_audio': '0.3914', 'step': 9652, 'global_step': 20209} [2025-08-18 06:28:57] {'loss': '0.6595', 'loss_video': '0.2608', 'loss_audio': '0.3988', 'step': 9662, 'global_step': 20219} [2025-08-18 06:31:32] {'loss': '0.6580', 'loss_video': '0.2621', 'loss_audio': '0.3959', 'step': 9672, 'global_step': 20229} [2025-08-18 06:34:05] {'loss': '0.7074', 'loss_video': '0.2680', 'loss_audio': '0.4393', 'step': 9682, 'global_step': 20239} [2025-08-18 06:36:41] {'loss': '0.6710', 'loss_video': '0.2512', 'loss_audio': '0.4198', 'step': 9692, 'global_step': 20249} [2025-08-18 06:36:47] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 06:37:04] Saved checkpoint at epoch 1, step 9693, global_step 20250 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step20250 [2025-08-18 06:37:04] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step19750 has been deleted successfully per cfg.save_total_limit.
[2025-08-18 06:39:26] {'loss': '0.6048', 'loss_video': '0.2304', 'loss_audio': '0.3745', 'step': 9702, 'global_step': 20259} [2025-08-18 06:41:55] {'loss': '0.6787', 'loss_video': '0.2543', 'loss_audio': '0.4244', 'step': 9712, 'global_step': 20269} [2025-08-18 06:44:19] {'loss': '0.6142', 'loss_video': '0.2421', 'loss_audio': '0.3721', 'step': 9722, 'global_step': 20279} [2025-08-18 06:47:00] {'loss': '0.6734', 'loss_video': '0.2631', 'loss_audio': '0.4103', 'step': 9732, 'global_step': 20289} [2025-08-18 06:49:15] {'loss': '0.6883', 'loss_video': '0.2545', 'loss_audio': '0.4338', 'step': 9742, 'global_step': 20299} [2025-08-18 06:51:27] {'loss': '0.6606', 'loss_video': '0.2548', 'loss_audio': '0.4058', 'step': 9752, 'global_step': 20309} [2025-08-18 06:54:02] {'loss': '0.6733', 'loss_video': '0.2700', 'loss_audio': '0.4033', 'step': 9762, 'global_step': 20319} [2025-08-18 06:56:34] {'loss': '0.7154', 'loss_video': '0.2917', 'loss_audio': '0.4237', 'step': 9772, 'global_step': 20329} [2025-08-18 06:58:46] {'loss': '0.6521', 'loss_video': '0.2463', 'loss_audio': '0.4058', 'step': 9782, 'global_step': 20339} [2025-08-18 07:01:41] {'loss': '0.7210', 'loss_video': '0.2968', 'loss_audio': '0.4242', 'step': 9792, 'global_step': 20349} [2025-08-18 07:04:06] {'loss': '0.6628', 'loss_video': '0.2312', 'loss_audio': '0.4316', 'step': 9802, 'global_step': 20359} [2025-08-18 07:06:14] {'loss': '0.6535', 'loss_video': '0.2337', 'loss_audio': '0.4198', 'step': 9812, 'global_step': 20369} [2025-08-18 07:08:36] {'loss': '0.6927', 'loss_video': '0.2716', 'loss_audio': '0.4211', 'step': 9822, 'global_step': 20379} [2025-08-18 07:11:09] {'loss': '0.6492', 'loss_video': '0.2502', 'loss_audio': '0.3990', 'step': 9832, 'global_step': 20389} [2025-08-18 07:13:32] {'loss': '0.6726', 'loss_video': '0.2508', 'loss_audio': '0.4218', 'step': 9842, 'global_step': 20399} [2025-08-18 07:16:00] {'loss': '0.6673', 'loss_video': '0.2802', 'loss_audio': '0.3871', 'step': 9852, 'global_step': 20409} 
[2025-08-18 07:18:27] {'loss': '0.6652', 'loss_video': '0.2638', 'loss_audio': '0.4014', 'step': 9862, 'global_step': 20419} [2025-08-18 07:20:49] {'loss': '0.6615', 'loss_video': '0.2462', 'loss_audio': '0.4153', 'step': 9872, 'global_step': 20429} [2025-08-18 07:23:21] {'loss': '0.6617', 'loss_video': '0.2446', 'loss_audio': '0.4171', 'step': 9882, 'global_step': 20439} [2025-08-18 07:25:34] {'loss': '0.7406', 'loss_video': '0.2882', 'loss_audio': '0.4524', 'step': 9892, 'global_step': 20449} [2025-08-18 07:28:03] {'loss': '0.6856', 'loss_video': '0.2609', 'loss_audio': '0.4247', 'step': 9902, 'global_step': 20459} [2025-08-18 07:30:48] {'loss': '0.6660', 'loss_video': '0.2730', 'loss_audio': '0.3929', 'step': 9912, 'global_step': 20469} [2025-08-18 07:33:35] {'loss': '0.6637', 'loss_video': '0.2932', 'loss_audio': '0.3706', 'step': 9922, 'global_step': 20479} [2025-08-18 07:36:10] {'loss': '0.6958', 'loss_video': '0.2725', 'loss_audio': '0.4233', 'step': 9932, 'global_step': 20489} [2025-08-18 07:38:45] {'loss': '0.6601', 'loss_video': '0.2676', 'loss_audio': '0.3924', 'step': 9942, 'global_step': 20499} [2025-08-18 07:38:51] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 07:39:10] Saved checkpoint at epoch 1, step 9943, global_step 20500 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step20500 [2025-08-18 07:39:11] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step20000 has been deleted successfully per cfg.save_total_limit.
[2025-08-18 07:41:33] {'loss': '0.6880', 'loss_video': '0.2679', 'loss_audio': '0.4201', 'step': 9952, 'global_step': 20509} [2025-08-18 07:43:59] {'loss': '0.6686', 'loss_video': '0.2584', 'loss_audio': '0.4102', 'step': 9962, 'global_step': 20519} [2025-08-18 07:46:21] {'loss': '0.7281', 'loss_video': '0.2804', 'loss_audio': '0.4477', 'step': 9972, 'global_step': 20529} [2025-08-18 07:48:49] {'loss': '0.6885', 'loss_video': '0.2737', 'loss_audio': '0.4147', 'step': 9982, 'global_step': 20539} [2025-08-18 07:51:10] {'loss': '0.6905', 'loss_video': '0.2848', 'loss_audio': '0.4057', 'step': 9992, 'global_step': 20549} [2025-08-18 07:53:48] {'loss': '0.6695', 'loss_video': '0.2590', 'loss_audio': '0.4105', 'step': 10002, 'global_step': 20559} [2025-08-18 07:56:25] {'loss': '0.6868', 'loss_video': '0.2569', 'loss_audio': '0.4299', 'step': 10012, 'global_step': 20569} [2025-08-18 07:58:42] {'loss': '0.6379', 'loss_video': '0.2362', 'loss_audio': '0.4017', 'step': 10022, 'global_step': 20579} [2025-08-18 08:00:56] {'loss': '0.6341', 'loss_video': '0.2512', 'loss_audio': '0.3828', 'step': 10032, 'global_step': 20589} [2025-08-18 08:03:20] {'loss': '0.7603', 'loss_video': '0.2921', 'loss_audio': '0.4681', 'step': 10042, 'global_step': 20599} [2025-08-18 08:05:38] {'loss': '0.7123', 'loss_video': '0.2756', 'loss_audio': '0.4367', 'step': 10052, 'global_step': 20609} [2025-08-18 08:08:10] {'loss': '0.6731', 'loss_video': '0.2541', 'loss_audio': '0.4190', 'step': 10062, 'global_step': 20619} [2025-08-18 08:10:31] {'loss': '0.6414', 'loss_video': '0.2418', 'loss_audio': '0.3996', 'step': 10072, 'global_step': 20629} [2025-08-18 08:13:12] {'loss': '0.6979', 'loss_video': '0.2738', 'loss_audio': '0.4242', 'step': 10082, 'global_step': 20639} [2025-08-18 08:15:35] {'loss': '0.6648', 'loss_video': '0.2313', 'loss_audio': '0.4335', 'step': 10092, 'global_step': 20649} [2025-08-18 08:17:58] {'loss': '0.6683', 'loss_video': '0.2642', 'loss_audio': '0.4041', 'step': 10102, 
'global_step': 20659} [2025-08-18 08:20:01] {'loss': '0.7087', 'loss_video': '0.2875', 'loss_audio': '0.4212', 'step': 10112, 'global_step': 20669} [2025-08-18 08:22:27] {'loss': '0.6771', 'loss_video': '0.2558', 'loss_audio': '0.4214', 'step': 10122, 'global_step': 20679} [2025-08-18 08:24:18] {'loss': '0.6658', 'loss_video': '0.2297', 'loss_audio': '0.4361', 'step': 10132, 'global_step': 20689} [2025-08-18 08:26:46] {'loss': '0.6847', 'loss_video': '0.2699', 'loss_audio': '0.4148', 'step': 10142, 'global_step': 20699} [2025-08-18 08:29:20] {'loss': '0.7481', 'loss_video': '0.2756', 'loss_audio': '0.4725', 'step': 10152, 'global_step': 20709} [2025-08-18 08:31:50] {'loss': '0.6917', 'loss_video': '0.2658', 'loss_audio': '0.4259', 'step': 10162, 'global_step': 20719} [2025-08-18 08:34:00] {'loss': '0.6384', 'loss_video': '0.2583', 'loss_audio': '0.3801', 'step': 10172, 'global_step': 20729} [2025-08-18 08:36:36] {'loss': '0.5889', 'loss_video': '0.2244', 'loss_audio': '0.3645', 'step': 10182, 'global_step': 20739} [2025-08-18 08:39:00] {'loss': '0.6696', 'loss_video': '0.2581', 'loss_audio': '0.4115', 'step': 10192, 'global_step': 20749} [2025-08-18 08:39:06] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 08:39:23] Saved checkpoint at epoch 1, step 10193, global_step 20750 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step20750 [2025-08-18 08:39:23] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step20250 has been deleted successfully per cfg.save_total_limit.
[2025-08-18 08:41:48] {'loss': '0.6890', 'loss_video': '0.2364', 'loss_audio': '0.4526', 'step': 10202, 'global_step': 20759} [2025-08-18 08:44:26] {'loss': '0.6682', 'loss_video': '0.2467', 'loss_audio': '0.4215', 'step': 10212, 'global_step': 20769} [2025-08-18 08:46:32] {'loss': '0.6531', 'loss_video': '0.2389', 'loss_audio': '0.4142', 'step': 10222, 'global_step': 20779} [2025-08-18 08:49:04] {'loss': '0.6239', 'loss_video': '0.2377', 'loss_audio': '0.3862', 'step': 10232, 'global_step': 20789} [2025-08-18 08:51:25] {'loss': '0.6417', 'loss_video': '0.2571', 'loss_audio': '0.3846', 'step': 10242, 'global_step': 20799} [2025-08-18 08:54:16] {'loss': '0.6866', 'loss_video': '0.2943', 'loss_audio': '0.3923', 'step': 10252, 'global_step': 20809} [2025-08-18 08:56:44] {'loss': '0.6720', 'loss_video': '0.2639', 'loss_audio': '0.4082', 'step': 10262, 'global_step': 20819} [2025-08-18 08:59:19] {'loss': '0.6638', 'loss_video': '0.2745', 'loss_audio': '0.3893', 'step': 10272, 'global_step': 20829} [2025-08-18 09:01:32] {'loss': '0.6486', 'loss_video': '0.2507', 'loss_audio': '0.3979', 'step': 10282, 'global_step': 20839} [2025-08-18 09:03:56] {'loss': '0.6909', 'loss_video': '0.2723', 'loss_audio': '0.4186', 'step': 10292, 'global_step': 20849} [2025-08-18 09:06:18] {'loss': '0.6742', 'loss_video': '0.2543', 'loss_audio': '0.4199', 'step': 10302, 'global_step': 20859} [2025-08-18 09:08:39] {'loss': '0.6744', 'loss_video': '0.2587', 'loss_audio': '0.4158', 'step': 10312, 'global_step': 20869} [2025-08-18 09:11:10] {'loss': '0.6951', 'loss_video': '0.2720', 'loss_audio': '0.4231', 'step': 10322, 'global_step': 20879} [2025-08-18 09:13:34] {'loss': '0.6698', 'loss_video': '0.2480', 'loss_audio': '0.4218', 'step': 10332, 'global_step': 20889} [2025-08-18 09:16:14] {'loss': '0.6975', 'loss_video': '0.2484', 'loss_audio': '0.4492', 'step': 10342, 'global_step': 20899} [2025-08-18 09:18:29] {'loss': '0.6580', 'loss_video': '0.2636', 'loss_audio': '0.3945', 'step': 10352, 
'global_step': 20909} [2025-08-18 09:21:07] {'loss': '0.7064', 'loss_video': '0.2716', 'loss_audio': '0.4348', 'step': 10362, 'global_step': 20919} [2025-08-18 09:23:41] {'loss': '0.6432', 'loss_video': '0.2277', 'loss_audio': '0.4155', 'step': 10372, 'global_step': 20929} [2025-08-18 09:25:58] {'loss': '0.6996', 'loss_video': '0.2747', 'loss_audio': '0.4250', 'step': 10382, 'global_step': 20939} [2025-08-18 09:28:12] {'loss': '0.6863', 'loss_video': '0.2537', 'loss_audio': '0.4326', 'step': 10392, 'global_step': 20949} [2025-08-18 09:30:37] {'loss': '0.6742', 'loss_video': '0.2842', 'loss_audio': '0.3900', 'step': 10402, 'global_step': 20959} [2025-08-18 09:32:49] {'loss': '0.7010', 'loss_video': '0.2771', 'loss_audio': '0.4239', 'step': 10412, 'global_step': 20969} [2025-08-18 09:35:20] {'loss': '0.6917', 'loss_video': '0.2724', 'loss_audio': '0.4193', 'step': 10422, 'global_step': 20979} [2025-08-18 09:37:47] {'loss': '0.7036', 'loss_video': '0.2747', 'loss_audio': '0.4289', 'step': 10432, 'global_step': 20989} [2025-08-18 09:40:00] {'loss': '0.7286', 'loss_video': '0.2540', 'loss_audio': '0.4746', 'step': 10442, 'global_step': 20999} [2025-08-18 09:40:07] The model is going to be split into checkpoint shards. You can find where each parameter has been saved in the index located at pytorch_model.bin.index.json. [2025-08-18 09:40:24] Saved checkpoint at epoch 1, step 10443, global_step 21000 to ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step21000 [2025-08-18 09:40:24] ./outputs/audio_video/001-Wan2_1_T2V_1_3B/epoch001-global_step20500 has been deleted successfully per cfg.save_total_limit.
[2025-08-18 09:42:57] {'loss': '0.6790', 'loss_video': '0.2627', 'loss_audio': '0.4164', 'step': 10452, 'global_step': 21009} [2025-08-18 09:45:29] {'loss': '0.7264', 'loss_video': '0.2984', 'loss_audio': '0.4280', 'step': 10462, 'global_step': 21019} [2025-08-18 09:48:03] {'loss': '0.6459', 'loss_video': '0.2476', 'loss_audio': '0.3983', 'step': 10472, 'global_step': 21029} [2025-08-18 09:50:43] {'loss': '0.6888', 'loss_video': '0.2761', 'loss_audio': '0.4127', 'step': 10482, 'global_step': 21039} [2025-08-18 09:53:15] {'loss': '0.6783', 'loss_video': '0.2805', 'loss_audio': '0.3977', 'step': 10492, 'global_step': 21049} [2025-08-18 09:55:51] {'loss': '0.6788', 'loss_video': '0.2597', 'loss_audio': '0.4191', 'step': 10502, 'global_step': 21059} [2025-08-18 09:58:30] {'loss': '0.7026', 'loss_video': '0.2746', 'loss_audio': '0.4280', 'step': 10512, 'global_step': 21069} [2025-08-18 10:00:54] {'loss': '0.6404', 'loss_video': '0.2436', 'loss_audio': '0.3968', 'step': 10522, 'global_step': 21079} [2025-08-18 10:03:04] {'loss': '0.7087', 'loss_video': '0.2762', 'loss_audio': '0.4324', 'step': 10532, 'global_step': 21089} [2025-08-18 10:05:24] {'loss': '0.6978', 'loss_video': '0.2456', 'loss_audio': '0.4522', 'step': 10542, 'global_step': 21099} [2025-08-18 10:07:51] {'loss': '0.6791', 'loss_video': '0.2513', 'loss_audio': '0.4278', 'step': 10552, 'global_step': 21109} [2025-08-18 10:10:07] {'loss': '0.7458', 'loss_video': '0.3097', 'loss_audio': '0.4361', 'step': 10562, 'global_step': 21119} [2025-08-18 10:12:21] {'loss': '0.6364', 'loss_video': '0.2340', 'loss_audio': '0.4023', 'step': 10572, 'global_step': 21129} [2025-08-18 10:12:25] Building buckets... 
[2025-08-18 10:12:29] Bucket Info: [2025-08-18 10:12:29] Bucket [#sample, #batch] by aspect ratio: {'0.38': [73, 9], '0.43': [269, 42], '0.48': [48, 4], '0.50': [82, 7], '0.53': [165, 23], '0.54': [578, 69], '0.56': [94859, 16902], '0.62': [844, 128], '0.67': [2354, 309], '0.75': [34023, 3482], '1.00': [303, 28], '1.33': [268, 23], '1.50': [76, 5], '1.78': [870, 90]} [2025-08-18 10:12:29] Image Bucket [#sample, #batch] by HxWxT: {} [2025-08-18 10:12:29] Video Bucket [#sample, #batch] by HxWxT: {('480p', 81): [8985, 2990], ('480p', 65): [16310, 4073], ('480p', 49): [13049, 3257], ('480p', 33): [7857, 1567], ('360p', 81): [6786, 1352], ('360p', 65): [5654, 937], ('360p', 49): [6556, 1088], ('360p', 33): [7757, 965], ('240p', 81): [12163, 1210], ('240p', 65): [14890, 1234], ('240p', 49): [13749, 1139], ('240p', 33): [21056, 1309]} [2025-08-18 10:12:29] #training batch: 20.63 K, #training sample: 131.65 K, #non empty bucket: 164 [2025-08-18 10:12:30] Beginning epoch 2... [2025-08-18 10:14:17] {'loss': '0.7067', 'loss_video': '0.2600', 'loss_audio': '0.4467', 'step': 5, 'global_step': 21119} [2025-08-18 10:16:46] {'loss': '0.7038', 'loss_video': '0.2977', 'loss_audio': '0.4061', 'step': 15, 'global_step': 21129} [2025-08-18 10:19:12] {'loss': '0.6742', 'loss_video': '0.2449', 'loss_audio': '0.4293', 'step': 25, 'global_step': 21139} [2025-08-18 10:21:59] {'loss': '0.7014', 'loss_video': '0.2799', 'loss_audio': '0.4215', 'step': 35, 'global_step': 21149} [2025-08-18 10:24:36] {'loss': '0.7149', 'loss_video': '0.3035', 'loss_audio': '0.4115', 'step': 45, 'global_step': 21159} [2025-08-18 10:26:49] {'loss': '0.6775', 'loss_video': '0.2725', 'loss_audio': '0.4049', 'step': 55, 'global_step': 21169} [2025-08-18 10:29:11] {'loss': '0.7116', 'loss_video': '0.2851', 'loss_audio': '0.4265', 'step': 65, 'global_step': 21179} [2025-08-18 10:31:38] {'loss': '0.6695', 'loss_video': '0.2564', 'loss_audio': '0.4131', 'step': 75, 'global_step': 21189} [2025-08-18 10:34:03] {'loss': 
'0.6888', 'loss_video': '0.2533', 'loss_audio': '0.4355', 'step': 85, 'global_step': 21199} [2025-08-18 10:36:41] {'loss': '0.7079', 'loss_video': '0.2954', 'loss_audio': '0.4124', 'step': 95, 'global_step': 21209} [2025-08-18 10:39:10] {'loss': '0.6393', 'loss_video': '0.2623', 'loss_audio': '0.3770', 'step': 105, 'global_step': 21219} [2025-08-18 10:41:33] {'loss': '0.7136', 'loss_video': '0.2704', 'loss_audio': '0.4432', 'step': 115, 'global_step': 21229}
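The loss records in this log are Python-dict-style strings, so the loss curves can be recovered with a short parser. A minimal sketch, assuming the exact `[timestamp] {...}` record format used throughout this log; note that in every record `loss` equals `loss_video + loss_audio` up to rounding, i.e. the two branch losses are summed with unit weights:

```python
# Minimal sketch for parsing these training-log records.
import ast
import re

# Matches "[2025-08-18 02:31:57] {...}" (no nested braces in the dicts).
RECORD_RE = re.compile(r"\[[0-9 :-]+\] (\{[^}]+\})")

def parse_records(text: str) -> list[dict]:
    """Extract every loss record as a dict with float losses and int steps."""
    records = []
    for match in RECORD_RE.finditer(text):
        raw = ast.literal_eval(match.group(1))
        if "loss" not in raw:  # skip checkpoint/info messages
            continue
        records.append({
            "loss": float(raw["loss"]),
            "loss_video": float(raw["loss_video"]),
            "loss_audio": float(raw["loss_audio"]),
            "global_step": raw["global_step"],
        })
    return records
```

Feeding the log text to `parse_records` gives one dict per record, ready for plotting loss against `global_step` or checking the `loss = loss_video + loss_audio` identity.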